直接在Scipy稀疏矩阵上使用Intel MKL库来计算内存较少的A点AT

查看scipy稀疏产品的Python代码。注意，它在2次调用中调用了已编译的代码。

看起来mkl代码做同样的事情

https://software.intel.com/zh-CN/node/468640

如果request = 1，则例程仅计算长度为m + 1的数组ic的值，该数组的内存必须事先分配。退出时，值ic（m +
1）-1是数组c和jc中元素的实际数量。
如果request = 2，则例程先前已使用参数request = 1进行了调用，输出数组jc和c在调用程序中分配，并且它们的长度至少为ic（m +
1）-1。

首先

ic

根据行的数量进行分配

（从输入中得知），然后使用调用mkl代码

request=1

。

因为

request=2

您必须根据中的大小分配

和

jc

数组

ic(m+1) - 1

。这

nnz

与输入数组中的数目不同。

您正在使用

request1 = c_int(0)

，这要求

数组的大小正确-在没有实际执行

dot

（或a

request 1

）的情况下是未知的。

==================

File:        /usr/lib/python3/dist-packages/scipy/sparse/compressed.pyDefinition:  A._mul_sparse_matrix(self, other)

传递1分配

indptr

（音符大小），并传递指针（此传递中的数据无关紧要）

    indptr = np.empty(major_axis + 1, dtype=idx_dtype)    fn = getattr(_sparsetools, self.format + '_matmat_pass1')    fn(M, N,       np.asarray(self.indptr, dtype=idx_dtype),       np.asarray(self.indices, dtype=idx_dtype),       np.asarray(other.indptr, dtype=idx_dtype),       np.asarray(other.indices, dtype=idx_dtype),       indptr)    nnz = indptr[-1]

传递2分配

indptr

（不同大小），并基于

nnz

indices

和

data

。

    indptr = np.asarray(indptr, dtype=idx_dtype)    indices = np.empty(nnz, dtype=idx_dtype)    data = np.empty(nnz, dtype=upcast(self.dtype, other.dtype))    fn = getattr(_sparsetools, self.format + '_matmat_pass2')    fn(M, N, np.asarray(self.indptr, dtype=idx_dtype),       np.asarray(self.indices, dtype=idx_dtype),       self.data,       np.asarray(other.indptr, dtype=idx_dtype),       np.asarray(other.indices, dtype=idx_dtype),       other.data,       indptr, indices, data)

最后使用这些数组制作一个新数组。

    return self.__class__((data,indices,indptr),shape=(M,N))

该

mkl

库应以相同的方式使用。

===================

https://github.com/scipy/scipy/blob/master/scipy/sparse/sparsetools/csr.h

有C代码用于

csr_matmat_pass1

和

csr_matmat_pass2

====================

如果有帮助，这里是这些传递的纯Python实现。文字转换，无需尝试利用任何数组操作。

def pass1(A, B):    nrow,ncol=A.shape    Aptr=A.indptr    Bptr=B.indptr    Cp=np.zeros(nrow+1,int)    mask=np.zeros(ncol,int)-1    nnz=0    for i in range(nrow):        row_nnz=0        for jj in range(Aptr[i],Aptr[i+1]): j=A.indices[jj] for kk in range(Bptr[j],Bptr[j+1]):     k=B.indices[kk]     if mask[k]!=i:         mask[k]=i         row_nnz += 1        nnz += row_nnz        Cp[i+1]=nnz    return Cpdef pass2(A,B,Cnnz):    nrow,ncol=A.shape    Ap,Aj,Ax=A.indptr,A.indices,A.data    Bp,Bj,Bx=B.indptr,B.indices,B.data    next = np.zeros(ncol,int)-1    sums = np.zeros(ncol,A.dtype)    Cp=np.zeros(nrow+1,int)    Cj=np.zeros(Cnnz,int)    Cx=np.zeros(Cnnz,A.dtype)    nnz = 0    for i in range(nrow):        head = -2        length = 0        for jj in range(Ap[i], Ap[i+1]): j, v = Aj[jj], Ax[jj] for kk in range(Bp[j], Bp[j+1]):     k = Bj[kk]     sums[k] += v*Bx[kk]     if (next[k]==-1):         next[k], head=head, k         length += 1        print(i,sums, next)        for _ in range(length): if sums[head] !=0:     Cj[nnz], Cx[nnz] = head, sums[head]     nnz += 1 temp = head head = next[head] next[temp], sums[temp] = -1, 0        Cp[i+1] = nnz     return Cp, Cj, Cxdef pass0(A,B):    Cp = pass1(A,B)    nnz = Cp[-1]    Cp,Cj,Cx=pass2(A,B,nnz)    N,M=A.shape[0], B.shape[1]    C=sparse.csr_matrix((Cx, Cj, Cp), shape=(N,M))    return C

无论

和

必须

csr

。它使用需要转换的转置，例如

B = A.T.tocsr()

。

直接在Scipy稀疏矩阵上使用Intel MKL库来计算内存较少的A点AT

面试问答相关栏目本月热门文章