Because the match phase in OPS5-type production systems accounts for most of the execution time and memory accesses, we consider distributed-memory parallel computers (multicomputers) well suited to relieving this bottleneck. We have proposed CPPS (Clustered Parallel Production Systems), a hash-based parallel production system for multicomputers based on the RETE algorithm. In this paper, we introduce software cache techniques to the memory nodes of CPPS as one optimization and implement them on the nCUBE2. The results show that, for a large-scale problem, CPPS with the software cache is about 2 times faster than the original CPPS and more than 7 times faster than the simple hash method proposed by Acharya et al., because communication costs are greatly reduced.