Implementation of Digital Signal Processing Algorithms on Parallel Hardware: A Review Article

Gabriel Bravo Martínez, Jesús Martin Silva Aceves, Soledad Vianey Torres Argüelles, Francisco Javier Enríquez Aguilera



DOI: http://dx.doi.org/10.20983/culcyt.2018.3.10

Abstract


On digital signal processing with general-purpose computer systems, most of them built around a single multicore processor

Keywords


Digital signal processing, Algorithms, Parallel hardware





Copyright (c) 2018 CULCyT

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International license.

Responsible for the last update of this issue: Raúl Alfredo Meza González. Last modified: 5 October 2019.

The opinions expressed by the authors do not necessarily reflect the position of the publication's editor. The publication's contents and images are subject to a CC 4.0 International BY NC license.
