What is the fastest way to generate indices for accessing the right-triangle half of a square array?

Anonymous · Post by **Anonymous** » 11 Apr 2025, 13:17

Given a 2D NumPy array, the height and width are guaranteed to be equal, so the array is a square. I have implemented functions that are much more efficient than numpy.tril_indices, but I think they are still inefficient.

Code:

In [213]: sqr = square(6)

In [214]: x, y = triangle_half_UR_LL(6)

In [215]: sqr[x, y] = 0

In [216]: sqr
Out[216]:
array([[ 0,  0,  0,  0,  0,  0],
       [ 6,  0,  0,  0,  0,  0],
       [12, 13,  0,  0,  0,  0],
       [18, 19, 20,  0,  0,  0],
       [24, 25, 26, 27,  0,  0],
       [30, 31, 32, 33, 34,  0]])

In [217]: sqr = square(6)

In [218]: sqr[y, x] = 0

In [219]: sqr
Out[219]:
array([[ 0,  1,  2,  3,  4,  5],
       [ 0,  0,  8,  9, 10, 11],
       [ 0,  0,  0, 15, 16, 17],
       [ 0,  0,  0,  0, 22, 23],
       [ 0,  0,  0,  0,  0, 29],
       [ 0,  0,  0,  0,  0,  0]])

In [220]: sqr = square(6)

In [221]: x, y = triangle_half_UL(6)

In [222]: sqr[x, y] = 0

In [223]: sqr
Out[223]:
array([[ 0,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0, 11],
       [ 0,  0,  0,  0, 16, 17],
       [ 0,  0,  0, 21, 22, 23],
       [ 0,  0, 26, 27, 28, 29],
       [ 0, 31, 32, 33, 34, 35]])

In [224]: x, y = triangle_half_LR(6)

In [225]: sqr = square(6)

In [226]: sqr[x, y] = 0

In [227]: sqr
Out[227]:
array([[ 0,  1,  2,  3,  4,  0],
       [ 6,  7,  8,  9,  0,  0],
       [12, 13, 14,  0,  0,  0],
       [18, 19,  0,  0,  0,  0],
       [24,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0]])
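To make the four halves in the session above explicit, here is a minimal sketch that describes each one as an inequality on the row index i and the column index j. The helper name half_masks is hypothetical and not part of the original post. It builds boolean masks rather than index arrays, so it is meant as a readable reference for checking results, not as a fast replacement.

```python
import numpy as np


def half_masks(n):
    """Return boolean masks for the four triangular halves of an n x n array.

    Hypothetical helper for illustration; every mask includes its diagonal.
    """
    i, j = np.ogrid[:n, :n]      # i: column vector of rows, j: row vector of columns
    return {
        "UR": i <= j,            # upper right, main diagonal included
        "LL": i >= j,            # lower left, main diagonal included
        "UL": i + j <= n - 1,    # upper left, anti-diagonal included
        "LR": i + j >= n - 1,    # lower right, anti-diagonal included
    }


sqr = np.arange(36).reshape(6, 6)
masks = half_masks(6)
cleared = sqr.copy()
cleared[masks["LR"]] = 0         # zero the lower-right half, as in Out[227]
print(cleared)
```

Clearing with masks["UR"], masks["LL"] or masks["UL"] instead should reproduce the other three outputs shown above.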

Code
import numpy as np
import numba as nb


@nb.njit(cache=True, parallel=True, nogil=True)
def triangle_half_UR_LL(size: int, swap: bool = False) -> tuple[np.ndarray, np.ndarray]:
    # Upper-right half (diagonal included); swap=True returns the lower-left half.
    # uint16 indices only work for size <= 65536.
    total = (size + 1) * size // 2
    x_coords = np.full(total, 0, dtype=np.uint16)
    y_coords = np.full(total, 0, dtype=np.uint16)
    offset = 0
    for i in nb.prange(size):
        # Row i holds size - i entries (columns i..size-1) starting at this flat offset.
        offset = i * size - (i - 1) * i // 2
        end = offset + size - i
        x_coords[offset:end] = i
        y_coords[offset:end] = np.arange(i, size, dtype=np.uint16)

    return (x_coords, y_coords) if not swap else (y_coords, x_coords)


@nb.njit(cache=True, parallel=True, nogil=True)
def triangle_half_UL(size: int) -> tuple[np.ndarray, np.ndarray]:
    # Upper-left half (anti-diagonal included).
    total = (size + 1) * size // 2
    x_coords = np.full(total, 0, dtype=np.uint16)
    y_coords = np.full(total, 0, dtype=np.uint16)
    offset = 0
    for i in nb.prange(size):
        # Row i holds size - i entries (columns 0..size-i-1).
        offset = i * size - (i - 1) * i // 2
        end = offset + size - i
        x_coords[offset:end] = i
        y_coords[offset:end] = np.arange(size - i, dtype=np.uint16)

    return (x_coords, y_coords)


@nb.njit(cache=True, parallel=True, nogil=True)
def triangle_half_LR(size: int) -> tuple[np.ndarray, np.ndarray]:
    # Lower-right half (anti-diagonal included).
    total = (size + 1) * size // 2
    x_coords = np.full(total, 0, dtype=np.uint16)
    y_coords = np.full(total, 0, dtype=np.uint16)
    offset = 0
    last = size - 1
    for i in nb.prange(size):
        # Row i holds i + 1 entries (columns last-i..size-1).
        offset = (i + 1) * i // 2
        end = offset + i + 1
        x_coords[offset:end] = i
        y_coords[offset:end] = np.arange(last - i, size, dtype=np.uint16)

    return (x_coords, y_coords)


def square(length: int) -> np.ndarray:
    # Test array holding 0..length**2 - 1 in row-major order.
    return np.arange(length**2).reshape((length, length))
According to my tests, the same pair of index arrays can be used to access both the upper-right and the lower-left halves: we only need to swap x and y to get the other half. That is not the case for the upper-left and lower-right halves, and I cannot unify them into a single function, so I need three of them.

In [228]: %timeit triangle_half_UR_LL(1000)
245 μs ± 3.69 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

In [229]: %timeit triangle_half_UL(1000)
199 μs ± 1.35 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

In [230]: %timeit triangle_half_LR(1000)
250 μs ± 1.48 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

In [231]: %timeit np.tril_indices(1000)
3.09 ms ± 33.2 μs per loop (mean ± std. dev.  of 7 runs, 100 loops each)
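On the remark that the upper-left and lower-right halves need their own functions: one possible way to unify them, sketched here under assumptions, is to mirror a single upper-right index pair. Reversing the column index gives the upper-left half, and reversing the row index gives the lower-right half. The helper name all_halves is hypothetical, and np.triu_indices stands in for any function that returns the upper-right indices, such as triangle_half_UR_LL.

```python
import numpy as np


def all_halves(n):
    """Derive index pairs for all four halves from one upper-right pair.

    Hypothetical helper for illustration, not the original poster's code.
    """
    x, y = np.triu_indices(n)    # rows x, columns y, with x <= y
    last = n - 1
    return {
        "UR": (x, y),
        "LL": (y, x),            # swap rows and columns (transpose)
        "UL": (x, last - y),     # mirror the columns
        "LR": (last - x, y),     # mirror the rows
    }


halves = all_halves(6)
sqr = np.arange(36).reshape(6, 6)
sqr[halves["UL"]] = 0            # zero the upper-left half
print(sqr)
```

The mirrored pairs list the same cells as triangle_half_UL and triangle_half_LR but in a different order, which makes no difference for assignments like the one above. Each mirror costs one extra subtraction over the whole index array, so whether this is cheaper than three separate functions would need to be measured.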
And yes, I am using Windows 11, which affects NumPy's performance.

xenig@Eliza MINGW64 ~
# which ipython
/mingw64/bin/ipython

xenig@Eliza MINGW64 ~
# ipython
Python 3.12.9 (main, Mar 23 2025, 23:40:57)  [GCC 14.2.0 64 bit (AMD64)]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.32.0 -- An enhanced Interactive Python. Type '?' for help.
...
In [8]: %timeit triangle_half_LR(1000)
346 μs ± 2.24 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

In [9]: %timeit triangle_half_UR_LL(1000)
363 μs ± 2.71 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

In [10]: %timeit triangle_half_UL(1000)
326 μs ± 4.38 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

In [11]: %timeit np.tril_indices(1000)
3.51 ms ± 31.2 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Normally, Python on MSYS2 is faster than Python on Windows, and CLANG64 Python is faster than MINGW64 Python.

xenig@Eliza CLANG64 ~
# where ipython
C:\msys64\clang64\bin\ipython.exe

xenig@Eliza CLANG64 ~
# ipython
Python 3.12.9 (main, Mar 24 2025, 05:59:42)  [GCC UCRT Clang 20.1.1 64 bit (AMD64)]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.32.0 -- An enhanced Interactive Python. Type '?' for help.
...
In [6]: %timeit triangle_half_UR_LL(1000)
1.05 ms ± 9.86 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

In [7]: %timeit triangle_half_UL(1000)
1 ms ± 7.99 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

In [8]: %timeit triangle_half_LR(1000)
1.06 ms ± 9.45 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

In [9]: %timeit np.tril_indices(1000)
3.27 ms ± 43.2 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)

The information above is somewhat off-topic, but the point is that numpy.tril_indices is very inefficient.
Can we do better?
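As one direction to explore, here is a minimal pure-NumPy sketch that builds the upper-right indices arithmetically with np.repeat and np.cumsum. It avoids both a Python-level loop and an n × n boolean mask, which, as far as I can tell, is what numpy.tril_indices goes through. The function name upper_right_indices and its dtype parameter are hypothetical. It has not been benchmarked here, so whether it beats the Numba versions above is an open question.

```python
import numpy as np


def upper_right_indices(n, dtype=np.intp):
    """Row and column indices of the upper-right half, diagonal included.

    Hypothetical sketch for illustration; not benchmarked.
    """
    counts = np.arange(n, 0, -1)               # row i holds n - i entries
    rows = np.repeat(np.arange(n), counts)     # 0, 0, ..., 1, 1, ..., n - 1
    starts = np.cumsum(counts) - counts        # flat offset at which each row begins
    # Position inside the row, shifted by the row number so each row starts on the diagonal.
    cols = np.arange(rows.size) - np.repeat(starts, counts) + rows
    return rows.astype(dtype), cols.astype(dtype)


x, y = upper_right_indices(6)
sqr = np.arange(36).reshape(6, 6)
sqr[x, y] = 0                                  # zero the upper-right half
print(sqr)
```

The result is in the same row-major order as np.triu_indices(n), so the two can be compared directly with np.array_equal when testing.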
