除了SQLite之外,嵌入式数据存储区还没有针对并发访问的优化,我也对SQLite的并发性能感到好奇,所以我做了一个基准测试:
import timeimport sqlite3import osimport randomimport sysimport multiprocessingclass Store(): def __init__(self, filename='kv.db'): self.conn = sqlite3.connect(filename, timeout=60) self.conn.execute('pragma journal_mode=wal') self.conn.execute('create table if not exists "kv" (key integer primary key, value integer) without rowid') self.conn.commit() def get(self, key): item = self.conn.execute('select value from "kv" where key=?', (key,)) if item: return next(item)[0] def set(self, key, value): self.conn.execute('replace into "kv" (key, value) values (?,?)', (key, value)) self.conn.commit()def worker(n): d = [random.randint(0, 1<<31) for _ in range(n)] s = Store() for i in d: s.set(i, i) random.shuffle(d) for i in d: s.get(i)def test(c): n = 5000 start = time.time() ps = [] for _ in range(c): p = multiprocessing.Process(target=worker, args=(n,)) p.start() ps.append(p) while any(p.is_alive() for p in ps): time.sleep(0.01) cost = time.time() - start print(f'{c:<10d}t{cost:<7.2f}t{n/cost:<20.2f}t{n*c/cost:<14.2f}')def main(): print(f'concurrencyttime(s)tpre process TPS(r/s)ttotal TPS(r/s)') for c in range(1, 9): test(c)if __name__ == '__main__': main()我的4核macOS盒上的结果是SSD容量:
concurrency time(s) pre process TPS(r/s) total TPS(r/s)10.65 7638.43 7638.4321.30 3854.69 7709.3831.83 2729.32 8187.9742.43 2055.25 8221.0153.07 1629.35 8146.7463.87 1290.63 7743.7874.80 1041.73 7292.1385.37 931.27 7450.15
8核Windows Server 2012云服务器上的结果,SSD卷:
concurrency time(s) pre process TPS(r/s) total TPS(r/s)1 4.12 1212.14 1212.142 7.87 634.93 1269.873 14.06 355.56 1066.694 15.84 315.59 1262.355 20.19 247.68 1238.416 24.52 203.96 1223.737 29.94 167.02 1169.128 34.98 142.92 1143.39
事实证明,无论并发性如何,总体吞吐量都是一致的,并且Windows上的SQLite比macOS慢,希望这会有所帮助。
由于SQLite写锁定是数据库明智的,因此为了获得更多的TPS,您可以将数据分区到多数据库文件中:
class MultiDBStore(): def __init__(self, buckets=5): self.buckets = buckets self.conns = [] for n in range(buckets): conn = sqlite3.connect(f'kv_{n}.db', timeout=60) conn.execute('pragma journal_mode=wal') conn.execute('create table if not exists "kv" (key integer primary key, value integer) without rowid') conn.commit() self.conns.append(conn) def _get_conn(self, key): assert isinstance(key, int) return self.conns[key % self.buckets] def get(self, key): item = self._get_conn(key).execute('select value from "kv" where key=?', (key,)) if item: return next(item)[0] def set(self, key, value): conn = self._get_conn(key) conn.execute('replace into "kv" (key, value) values (?,?)', (key, value)) conn.commit()在我的Mac上有20个分区的结果:
concurrency time(s) pre process TPS(r/s) total TPS(r/s)12.07 4837.17 4837.1722.51 3980.58 7961.1733.28 3047.68 9143.0344.02 2486.76 9947.0454.44 2249.94 11249.7164.76 2101.26 12607.5875.25 1903.69 13325.8285.71 1752.46 14019.70
总TPS高于单个数据库文件。



