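I solved the problem in the example code from the original question by setting the BLAS environment variables (from this link).
That is not the answer to my actual question, though. My first attempt (the second update) was wrong. The thread count has to be set
before numpy is first loaded, which means before importing the library (IncrementalPCA) that imports numpy, not just before my own `import numpy` line.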
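So what is the problem in the example code? It is not really a problem but a feature of the BLAS library that numpy uses. Trying to limit it with the multiprocessing library does not work because OpenBLAS uses all available threads by default.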
    import os
    os.environ["OMP_NUM_THREADS"] = "1"  # export OMP_NUM_THREADS=1
    os.environ["OPENBLAS_NUM_THREADS"] = "1"  # export OPENBLAS_NUM_THREADS=1
    os.environ["MKL_NUM_THREADS"] = "1"  # export MKL_NUM_THREADS=1
    os.environ["VECLIB_MAXIMUM_THREADS"] = "1"  # export VECLIB_MAXIMUM_THREADS=1
    os.environ["NUMEXPR_NUM_THREADS"] = "1"  # export NUMEXPR_NUM_THREADS=1

    from sklearn.datasets import load_digits
    from sklearn.decomposition import IncrementalPCA
    import numpy as np

    X, _ = load_digits(return_X_y=True)

    # Copy-paste and increase the size of the dataset to see the behavior at htop.
    for _ in range(8):
        X = np.vstack((X, X))

    print(X.shape)

    transformer = IncrementalPCA(n_components=7, batch_size=200)
    transformer.partial_fit(X[:100, :])
    X_transformed = transformer.fit_transform(X)
    print(X_transformed.shape)
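The ordering requirement can be made explicit with a small guard. This is only a sketch: `limit_blas_threads` is a hypothetical helper name, not part of numpy or scikit-learn, and it assumes the variables are only honored when they are set before numpy is first loaded in the process.

```python
import os
import sys


def limit_blas_threads(n_threads=1):
    """Set the thread variables used above.

    Returns True if numpy had not been imported yet when the variables
    were set, i.e. the limits were applied in time (an assumption about
    when the BLAS library reads them).
    """
    names = (
        "OMP_NUM_THREADS",
        "OPENBLAS_NUM_THREADS",
        "MKL_NUM_THREADS",
        "VECLIB_MAXIMUM_THREADS",
        "NUMEXPR_NUM_THREADS",
    )
    for name in names:
        os.environ[name] = str(n_threads)
    # If numpy is already in sys.modules, some earlier import loaded it.
    return "numpy" not in sys.modules


applied_in_time = limit_blas_threads(1)
if not applied_in_time:
    print("numpy was already imported; the limits may be ignored")
```

Calling the helper at the very top of the script, before any import that pulls in numpy (such as scikit-learn), should match the behavior of the code above.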
But you can set the right BLAS variable explicitly by checking which BLAS your numpy build uses, like this:

    >>> import numpy as np
    >>> np.__config__.show()

This gave the following results...
    blas_mkl_info:
      NOT AVAILABLE
    blis_info:
      NOT AVAILABLE
    openblas_info:
        libraries = ['openblas', 'openblas']
        library_dirs = ['/usr/local/lib']
        language = c
        define_macros = [('HAVE_CBLAS', None)]
    blas_opt_info:
        libraries = ['openblas', 'openblas']
        library_dirs = ['/usr/local/lib']
        language = c
        define_macros = [('HAVE_CBLAS', None)]
    lapack_mkl_info:
      NOT AVAILABLE
    openblas_lapack_info:
        libraries = ['openblas', 'openblas']
        library_dirs = ['/usr/local/lib']
        language = c
        define_macros = [('HAVE_CBLAS', None)]
    lapack_opt_info:
        libraries = ['openblas', 'openblas']
        library_dirs = ['/usr/local/lib']
        language = c
        define_macros = [('HAVE_CBLAS', None)]

...which means my numpy build uses OpenBLAS. All I need to write is

    os.environ["OPENBLAS_NUM_THREADS"] = "2"

to limit the number of threads the numpy library uses.
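The step of reading the configuration output and choosing a variable could be sketched as below. `pick_thread_variable` and `THREAD_VARIABLES` are hypothetical names, the sample text is a shortened copy of the output above, and the parsing assumes the `libraries = [...]` line layout shown there. Other numpy versions may print the configuration differently, so treat this as an illustration rather than a robust detector.

```python
import re

# Hypothetical mapping from a library-name fragment to the variable that
# limits it. Both variable names appear in the code earlier in this answer.
THREAD_VARIABLES = {
    "openblas": "OPENBLAS_NUM_THREADS",
    "mkl": "MKL_NUM_THREADS",
}


def pick_thread_variable(config_text):
    """Return the thread variable for the first recognised BLAS, or None."""
    for line in config_text.splitlines():
        # Only "libraries = [...]" lines name the linked libraries;
        # section headers such as "blas_mkl_info:" are skipped.
        match = re.match(r"\s*libraries\s*=\s*(\[.*\])", line)
        if not match:
            continue
        listed = match.group(1).lower()
        for fragment, variable in THREAD_VARIABLES.items():
            if fragment in listed:
                return variable
    return None


# Shortened sample in the layout assumed above.
sample = """\
blas_mkl_info:
  NOT AVAILABLE
openblas_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/lib']
"""
print(pick_thread_variable(sample))
```

Matching only the `libraries` lines avoids a false hit on sections such as `blas_mkl_info`, which are listed even when they are not available.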



