Implementation of Low Cost Memory Subsystem for Low-end IoT Devices
AUTHORS
Jonghee M. Youn, Computer Engineering, Yeungnam Univ., South Korea
Doosan Cho, Electrical & Electronic Engineering, Sunchon National Univ., South Korea
ABSTRACT
IoT and cloud computing devices are increasingly popular and are being developed in models ranging from high-end to low-cost, with the low-cost market growing most actively. In these devices, where internet communication is a key feature, the most expensive components are the memory and the screen panel. Screen panels are currently limited to LCD and OLED technology, so there is little choice, but memory options include flash memory, hard disk, DRAM, SRAM, SDRAM, multi-bank memory, and on-chip memory. Each type is therefore selected and configured according to requirements such as function, power consumption, performance, and cost. The memory architectures available to low-cost IoT devices are quite limited, typically a small SRAM combined with some flash memory or DRAM. For hard real-time IoT devices, it is very difficult to meet deadlines with such a memory structure, and developers apply various system optimizations to address this. Normally, multi-bank DRAM is selected at the hardware design stage. Accessing as many memory banks as possible in parallel in the same space can significantly improve system performance. If multi-bank memory is selected as the hardware, system software must support it; in other words, a compiler must be provided that generates program code for parallel memory access, because traditional compilers generate code for sequential access. In this paper, we propose a method for generating parallel memory access code to support multi-bank memory in low-cost IoT devices. The proposed method solves the data placement problem for multi-bank memory and maximizes system performance by actively using the multi-bank memory.
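To make the data placement problem concrete, the following is a minimal sketch of one generic greedy heuristic, and it is not the method proposed in this paper. It assumes the compiler has already identified pairs of variables whose loads or stores could be issued in the same cycle, each with a weight such as an estimated execution frequency. The names `assign_banks`, `conflict_weight`, and the example variables are hypothetical and used only for illustration.

```python
from collections import defaultdict


def assign_banks(pairs, num_banks=2):
    """Greedily place variables into memory banks.

    pairs: dict mapping (var_a, var_b) -> weight, where each pair is a
    candidate for parallel access (an assumption of this sketch).
    Returns a dict mapping each variable to a bank index.
    """
    # Build an undirected weighted graph of parallel-access candidates.
    graph = defaultdict(dict)
    for (a, b), w in pairs.items():
        graph[a][b] = graph[a].get(b, 0) + w
        graph[b][a] = graph[b].get(a, 0) + w

    # Visit the most heavily connected variables first (ties by name).
    order = sorted(graph, key=lambda v: (-sum(graph[v].values()), v))

    bank_of = {}
    for v in order:
        # Cost of a bank = weight of already placed neighbors in that bank,
        # because two accesses to the same bank cannot run in parallel.
        cost = [0] * num_banks
        for n, w in graph[v].items():
            if n in bank_of:
                cost[bank_of[n]] += w
        bank_of[v] = min(range(num_banks), key=lambda b: (cost[b], b))
    return bank_of


def conflict_weight(pairs, bank_of):
    """Total weight of candidate pairs that ended up in the same bank."""
    return sum(w for (a, b), w in pairs.items() if bank_of[a] == bank_of[b])


# Hypothetical example: four variables, weights are illustrative only.
pairs = {("a", "b"): 10, ("b", "c"): 5, ("a", "c"): 1, ("c", "d"): 4}
bank_of = assign_banks(pairs)
remaining = conflict_weight(pairs, bank_of)
```

In this example the heavily weighted pair `a` and `b` is split across the two banks, and only the lightest pair (`a`, `c`) remains in the same bank. A greedy pass like this is fast but not guaranteed to be optimal; it is shown only to illustrate what a bank-aware placement has to decide.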
KEYWORDS
Energy consumption, IoT system, Heterogeneous memory system, Load/store data dependence graph, Compiler technique, System optimization