Consider code generated by structtensor similar to the following:
void fn(double * f, double * w, double ** X, int L, int C) {
    ...
    for (int j = 0; j < C; ++j)
        for (int i = 0; i < L; ++i)
            f[j] += (w[i] * X[i][j]);
    ...
}
The code first loads X[i], which holds a pointer to row i, and then dereferences that pointer to reach X[i][j]. Each additional dimension of X adds another dependent memory load per element access, which hurts overall performance.
It would be more efficient if the generated code looked like:
void fn(double * f, double * w, double * X, int L, int C) {
    ...
    for (int j = 0; j < C; ++j)
        for (int i = 0; i < L; ++i)
            f[j] += (w[i] * X[i * C + j]);
    ...
}
Flattening multidimensional variables would reduce each element access to a single memory load per data structure, regardless of the number of dimensions, and should improve the performance of the core computation.
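To make the comparison concrete, here is a minimal sketch of both layouts side by side. The names fn_nested and fn_flat are hypothetical and only stand in for the elided generated code above; the const qualifiers and the size_t cast are my additions, not something structtensor emits today.

```c
#include <assert.h>
#include <stddef.h>

/* Pointer-to-pointer layout: each access loads the row pointer X[i]
 * first, then the element X[i][j]. (Hypothetical name.) */
void fn_nested(double *f, const double *w, double *const *X, int L, int C) {
    assert(f != NULL && w != NULL && X != NULL);
    for (int j = 0; j < C; ++j)
        for (int i = 0; i < L; ++i)
            f[j] += w[i] * X[i][j];
}

/* Flat row-major layout: element (i, j) lives at X[i * C + j],
 * so each access is one load from one buffer. (Hypothetical name.) */
void fn_flat(double *f, const double *w, const double *X, int L, int C) {
    assert(f != NULL && w != NULL && X != NULL);
    for (int j = 0; j < C; ++j)
        for (int i = 0; i < L; ++i)
            f[j] += w[i] * X[(size_t)i * (size_t)C + (size_t)j];
}
```

Both functions accumulate into f, so the caller is expected to zero it first. Given the same values in X, they should produce the same f; only the way X is addressed changes.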
In addition, this layout follows the convention used by external data manipulation libraries, e.g. numpy.ndarray, which would make it easier to integrate structtensor into established pipelines.
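For tensors with more than two dimensions, the same idea generalizes to the row-major (C-order) indexing that a C-contiguous numpy.ndarray uses. Below is a rough sketch of the offset computation; flat_offset is a hypothetical helper written for illustration, not an existing structtensor or NumPy function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: offset of element idx[0..ndim-1] inside a
 * row-major (C-order) buffer with the given shape. The last index
 * varies fastest, so for a 2-D shape {L, C} this yields i * C + j. */
size_t flat_offset(const size_t *shape, const size_t *idx, size_t ndim) {
    size_t offset = 0;
    for (size_t d = 0; d < ndim; ++d) {
        assert(idx[d] < shape[d]);            /* index must be in bounds */
        offset = offset * shape[d] + idx[d];  /* Horner-style accumulation */
    }
    return offset;
}
```

In generated code the shape is known to the loop nest, so this computation would presumably be inlined as an expression such as X[(i * C + j) * D + k] rather than called as a function.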
I think this is worth doing, since it would improve both usability and performance. It could even be an optional feature enabled through a command line parameter. What do you think?