efficientnetb0-exp1.log 26 KB

  1. nohup: ignoring input
  2. [W Context.cpp:69] Warning: torch.set_deterministic is in beta, and its design and functionality may change in the future. (function operator())
  3. [08 06:15:54 <frozen super_pulsar.proto.configuration_super_pulsar_manip>:39] WRN migrate all stand-alone args into a single task
  4. [08 06:15:54 <frozen super_pulsar.proto.configuration_super_pulsar_manip>:159] set task task_0's 0th input model path as /root/axera/axera-quan-hjj/model/efficientnet_size-160.onnx
  5. [08 06:15:54 <frozen super_pulsar.proto.configuration_super_pulsar_manip>:178] set task task_0's 0th output model path as /root/axera/axera-quan-hjj/joint/efficientnetb0-handpose.joint
  6. [08 06:15:54 <frozen super_pulsar.proto.configuration_super_pulsar_manip>:297] set task task_0's pulsar_conf.output_dir as /root/axera/axera-quan-hjj
  7. /opt/venv/lib/python3.6/site-packages/torch/cuda/__init__.py:52: UserWarning: CUDA initialization: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx (Triggered internally at /pytorch/c10/cuda/CUDAFunctions.cpp:100.)
  8. return torch._C._cuda_getDeviceCount() > 0
  9. [08 06:15:55 <frozen super_pulsar.func_wrappers.wrapper_pulsar_build>:17] planning task task_0
  10. [08 06:15:55 <frozen super_pulsar.func_wrappers.pulsar_build.neuwizard_step>:459] WRN affine_preprocess at QAT model compiling is deprecated, insert enforce_integers at front, please use scale_to_integers instead.
  11. [08 06:15:55 <frozen super_pulsar.func_wrappers.wrapper_pulsar_build>:340] ################## Running task task_0 ##################
  12. [08 06:15:55 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:30] python3 /root/python_modules/super_pulsar/super_pulsar/toolchain_wrappers/wrapper_neuwizard.py --config /tmp/tmpmy092f06.prototxt
  13. [08 06:15:55 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] [W Context.cpp:69] Warning: torch.set_deterministic is in beta, and its design and functionality may change in the future. (function operator())
  14. [08 06:17:06 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] ONNX Model Version 12 for "/root/axera/axera-quan-hjj/model/efficientnet_size-160.onnx"
  15. [08 06:17:06 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Planning load step finished; elapsed time: 0.01s
  16. [08 06:17:06 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Planning transform step finished; elapsed time: 0.01s
  17. [08 06:17:06 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Planning transform step finished; elapsed time: 0.00s
  18. [08 06:17:06 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Planning transform step finished; elapsed time: 0.00s
  19. [08 06:17:06 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Planning transform step to "native" finished; elapsed time: 0.16s
  20. [08 06:17:06 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Planning transform step to "native_no_bn" finished; elapsed time: 0.00s
  21. [08 06:17:06 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Planning transform step to "pretransformed" finished; elapsed time: 1.77s
  22. [08 06:17:06 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Planning calibrate step finished; elapsed time: 0.01s
  23. [08 06:17:06 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Planning transform step to "transformed" finished; elapsed time: 47.78s
  24. [08 06:17:06 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Planning transform step to "posttransformed" finished; elapsed time: 6.22s
  25. [08 06:17:06 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Planning transform step to "magma" finished; elapsed time: 1.51s
  26. [08 06:17:06 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Planning transform step to "magma_validified" finished; elapsed time: 0.34s
  27. [08 06:17:06 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Planning transform step to "lava_with_rtv" finished; elapsed time: 10.77s
  28. [08 06:17:06 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Planning dump_joint_model step finished; elapsed time: 0.00s
  29. [08 06:17:06 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Planning evaluate step finished; elapsed time: 0.00s
  30. [08 06:17:06 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Planning ir_bit_macs step finished; elapsed time: 0.00s
  31. [08 06:17:06 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Planning ir_bit_float_params step finished; elapsed time: 0.00s
  32. [08 06:17:06 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Planning ir_bit_quantized_params step finished; elapsed time: 0.00s
  33. [08 06:17:06 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Loading model finished; elapsed time: 0.00s
  34. [08 06:17:06 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Transforming to "onnx_step_1" finished; elapsed time: 0.01s
  35. [08 06:17:06 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Transforming to "onnx_step_2" finished; elapsed time: 0.00s
  36. [08 06:17:06 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Transforming to "onnx" finished; elapsed time: 0.00s
  37. [08 06:17:06 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Transforming to "native" finished; elapsed time: 0.07s
  38. [08 06:17:06 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Transforming to "native_no_bn" finished; elapsed time: 0.00s
  39. [08 06:17:06 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Transforming to "pretransformed" finished; elapsed time: 0.66s
  40. [08 06:17:06 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] /opt/venv/lib/python3.6/site-packages/torch/cuda/__init__.py:52: UserWarning: CUDA initialization: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx (Triggered internally at /pytorch/c10/cuda/CUDAFunctions.cpp:100.)
  41. [08 06:17:06 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] return torch._C._cuda_getDeviceCount() > 0
  42. [08 06:17:30 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Calibrating finished; elapsed time: 7.91s
  43. [08 06:17:30 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Transforming to "transformed" finished; elapsed time: 8.75s
  44. [08 06:17:30 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Transforming to "posttransformed" finished; elapsed time: 1.83s
  45. [08 06:17:30 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Transforming to "magma" finished; elapsed time: 0.68s
  46. [08 06:17:30 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Transforming to "magma_validified" finished; elapsed time: 0.06s
  47. [08 06:17:30 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Transforming to "lava_with_rtv" finished; elapsed time: 4.61s
  48. [08 06:17:30 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Dynamically planning transform step to "lava" finished; elapsed time: 0.02s
  49. [08 06:17:30 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Transforming to "lava" dynamically finished; elapsed time: 0.06s
  50. [08 06:17:30 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Dynamically planning transform step to "lava_onnx" finished; elapsed time: 0.55s
  51. [08 06:17:30 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Transforming to "lava_onnx" dynamically finished; elapsed time: 0.78s
  52. [08 06:17:30 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Dynamically planning transform step to "lava_onnx_axe" finished; elapsed time: 0.02s
  53. [08 06:17:30 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Transforming to "lava_onnx_axe" dynamically finished; elapsed time: 0.06s
  54. [08 06:17:36 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] /root/python_modules/neuwizard-latest/neuwizard/operators/lava/AX620/Conv2d.py:141: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at /pytorch/torch/csrc/utils/tensor_numpy.cpp:141.)
  55. [08 06:17:36 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Joint model dumpped as "/root/axera/axera-quan-hjj/joint/model.lava_joint"
  56. [08 06:17:36 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Dumping Joint Model finished; elapsed time: 4.17s
  57. [08 06:17:36 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Evaluation is not performed.
  58. [08 06:17:36 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Evaluating finished; elapsed time: 0.00s
  59. [08 06:17:36 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Overview Table of Bit MACs
  60. [08 06:17:36 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] | Domain | native | pretransformed | transformed | posttransformed | magma | lava |
  61. [08 06:17:36 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] |----------|----------|------------------|---------------|-------------------|---------|--------|
  62. [08 06:17:36 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] | Bit MACs | 13G | 52G | 53G | 61G | 59G | 60G |
  63. [08 06:17:36 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Bit MACs measurement for each domain finished; elapsed time: 0.37s
  64. [08 06:17:36 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Overview Table of parameter size
  65. [08 06:17:36 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] | Domain | native | pretransformed | transformed | posttransformed | magma | lava |
  66. [08 06:17:36 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] |----------------------|----------|------------------|---------------|-------------------|---------|--------|
  67. [08 06:17:36 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] | Parameter Size(bits) | 127M | 326M | 329M | 339M | 339M | 339M |
  68. [08 06:17:36 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Float Parameter size measurement for each domain finished; elapsed time: 0.54s
  69. [08 06:17:36 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Overview Table of parameter size
  70. [08 06:17:36 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] | Domain | native | pretransformed | transformed | posttransformed | magma | lava |
  71. [08 06:17:36 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] |----------------------|----------|------------------|---------------|-------------------|---------|--------|
  72. [08 06:17:36 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] | Parameter Size(bits) | 32M | 81M | 82M | 85M | 84M | 84M |
  73. [08 06:17:36 <frozen super_pulsar.toolchain_wrappers.wrapper_neuwizard>:36] DBG [neuwizard] Quantized Parameter size measurement for each domain finished; elapsed time: 0.63s
  74. [08 06:17:39 <frozen super_pulsar.toolchain_wrappers.wrapper_joint>:509] set op_8341's shape as [1, 8]
  75. [08 06:17:39 <frozen super_pulsar.toolchain_wrappers.wrapper_joint>:509] set op_8341's shape as [1, 8]
  76. [08 06:17:41 <frozen super_pulsar.toolchain_wrappers.wrapper_toolchain>:535] DBG working in "/root/tmpo0o7ztk9"
  77. [08 06:17:41 <frozen super_pulsar.toolchain_wrappers.wrapper_pulsar_compiler>:227] python3 pulsar.py gen /root/tmpo0o7ztk9/part_0.lava/part_0.lava env/ax620a_virtual_111_config.ini -b 1 -pe 16 --times_thres 0 --job_stealing 3 --checkall --param_compress --continuous_input --no_sim --hyper_params run_cf.wait_mode=True
  78. [08 06:25:32 <frozen super_pulsar.toolchain_wrappers.wrapper_pulsar_compiler>:250] DBG [pulsar] 
  79. [08 06:25:32 <frozen super_pulsar.toolchain_wrappers.wrapper_pulsar_compiler>:250] DBG [pulsar] inference_report.log:
  80. [08 06:25:32 <frozen super_pulsar.toolchain_wrappers.wrapper_pulsar_compiler>:250] DBG [pulsar] 
  81. [08 06:25:32 <frozen super_pulsar.toolchain_wrappers.wrapper_pulsar_compiler>:250] DBG [pulsar] |:-------------------------|:-----------------------------|:-------------|:-------------------|
  82. [08 06:25:32 <frozen super_pulsar.toolchain_wrappers.wrapper_pulsar_compiler>:250] DBG [pulsar] 
  83. [08 06:25:32 <frozen super_pulsar.toolchain_wrappers.wrapper_pulsar_compiler>:250] DBG [pulsar] |:----------------|-------------:|---------------:|
  84. [08 06:25:32 <frozen super_pulsar.toolchain_wrappers.wrapper_pulsar_compiler>:250] DBG [pulsar] 
  85. [08 06:25:32 <frozen super_pulsar.toolchain_wrappers.wrapper_pulsar_compiler>:250] DBG [pulsar] |:---------------|:-----------------|:-------------|:--------------|:------------|:---------------|:---------------|:-------------------|:---------------|
  86. [08 06:25:32 <frozen super_pulsar.toolchain_wrappers.wrapper_pulsar_compiler>:250] DBG [pulsar] 
  87. [08 06:25:32 <frozen super_pulsar.toolchain_wrappers.wrapper_pulsar_compiler>:250] DBG [pulsar] |:-----------------------|:------------------|:------------------|:------------------|:--------------------|
  88. [08 06:25:32 <frozen super_pulsar.toolchain_wrappers.wrapper_pulsar_compiler>:250] DBG [pulsar] 
  89. [08 06:25:32 <frozen super_pulsar.toolchain_wrappers.wrapper_pulsar_compiler>:250] DBG [pulsar] |:-----------------------|:-----------------|:--------------------|:--------------------|:--------------------|:------------------|:-----------------|:----------------|:----------------|
  90. [08 06:25:32 <frozen super_pulsar.toolchain_wrappers.wrapper_pulsar_compiler>:250] DBG [pulsar] 
  91. [08 06:25:32 <frozen super_pulsar.toolchain_wrappers.wrapper_pulsar_compiler>:250] DBG [pulsar] profile stream EU: ld/st_ratio might include ringbuf/linebuf/feature_swap parts; mv_ratio migth have ringbuf part.
  92. [08 06:25:32 <frozen super_pulsar.toolchain_wrappers.wrapper_pulsar_compiler>:250] DBG [pulsar] 
  93. [08 06:25:32 <frozen super_pulsar.toolchain_wrappers.wrapper_pulsar_compiler>:250] DBG [pulsar] |:--------------------|-----------:|:--------------|:-----------------|:---------------|:----------------------|:--------------|
  94. [08 06:25:32 <frozen super_pulsar.toolchain_wrappers.wrapper_pulsar_compiler>:250] DBG [pulsar] 
  95. [08 06:25:32 <frozen super_pulsar.toolchain_wrappers.wrapper_pulsar_compiler>:250] DBG [pulsar] |:---------------------------------------------|:----------------|:-----------------------|
  96. [08 06:25:32 <frozen super_pulsar.toolchain_wrappers.wrapper_pulsar_compiler>:250] DBG [pulsar] 
  97. [08 06:25:32 <frozen super_pulsar.toolchain_wrappers.wrapper_pulsar_compiler>:250] DBG [pulsar] |:-----------|-----------:|----------:|:--------|-------:|----------:|
  98. [08 06:25:32 <frozen super_pulsar.toolchain_wrappers.wrapper_pulsar_compiler>:250] DBG [pulsar] 
  99. [08 06:25:32 <frozen super_pulsar.toolchain_wrappers.wrapper_pulsar_compiler>:250] DBG [pulsar] inference: 23.6 ms, 42.35 fps
  100. [08 06:25:32 <frozen super_pulsar.toolchain_wrappers.wrapper_pulsar_compiler>:250] DBG [pulsar] qps = fps * batch_size = 42.35
  101. [08 06:25:32 <frozen super_pulsar.toolchain_wrappers.wrapper_pulsar_compiler>:250] DBG [pulsar] 
  102. [08 06:25:32 <frozen super_pulsar.toolchain_wrappers.wrapper_pulsar_compiler>:250] DBG [pulsar] simulated fps is based on DDR_BW: 1.59 GB/s
  103. [08 06:25:32 <frozen super_pulsar.toolchain_wrappers.wrapper_pulsar_compiler>:250] DBG [pulsar] 
  104. [08 06:25:32 <frozen super_pulsar.toolchain_wrappers.wrapper_pulsar_compiler>:250] DBG [pulsar] DDR IO stats:
  105. [08 06:25:32 <frozen super_pulsar.toolchain_wrappers.wrapper_pulsar_compiler>:250] DBG [pulsar] ideal_input_data_size: 14116608 Byte,
  106. [08 06:25:32 <frozen super_pulsar.toolchain_wrappers.wrapper_pulsar_compiler>:250] DBG [pulsar] ideal_output_data_size: 32 Byte,
  107. [08 06:25:32 <frozen super_pulsar.toolchain_wrappers.wrapper_pulsar_compiler>:250] DBG [pulsar] extra_mid_io_data_size: 7109136 Byte,
  108. [08 06:25:32 <frozen super_pulsar.toolchain_wrappers.wrapper_pulsar_compiler>:250] DBG [pulsar] total_io_data_size: 21225776 Byte
  109. [08 06:25:32 <frozen super_pulsar.toolchain_wrappers.wrapper_pulsar_compiler>:250] DBG [pulsar] 
  110. [08 06:25:32 <frozen super_pulsar.toolchain_wrappers.wrapper_pulsar_compiler>:250] DBG [pulsar] MAC per inference: 628214464 MAC@int8
  111. [08 06:25:32 <frozen super_pulsar.toolchain_wrappers.wrapper_pulsar_compiler>:250] DBG [pulsar] MAC utils: 2.89 %
  112. [08 06:25:32 <frozen super_pulsar.toolchain_wrappers.wrapper_pulsar_compiler>:250] DBG [pulsar] 
  113. [08 06:25:32 <frozen super_pulsar.toolchain_wrappers.wrapper_pulsar_compiler>:250] DBG [pulsar] commit_id:
  114. [08 06:25:32 <frozen super_pulsar.toolchain_wrappers.wrapper_pulsar_compiler>:250] DBG [pulsar] 
  115. [08 06:25:32 <frozen super_pulsar.toolchain_wrappers.wrapper_pulsar_compiler>:250] DBG [pulsar] |:------------------------------------|-----------:|:-------------|
  116. [08 06:25:32 <frozen super_pulsar.toolchain_wrappers.wrapper_pulsar_compiler>:250] DBG [pulsar] 
  117. [08 06:25:32 <frozen super_pulsar.toolchain_wrappers.wrapper_pulsar_compiler>:250] DBG [pulsar] subgraph num: 4
  118. [08 06:25:32 <frozen super_pulsar.toolchain_wrappers.wrapper_pulsar_compiler>:250] DBG [pulsar] 
  119. [08 06:25:32 <frozen super_pulsar.toolchain_wrappers.wrapper_pulsar_compiler>:250] DBG [pulsar] pulsar.py totally used 470s
  120. [08 06:25:38 <frozen super_pulsar.toolchain_wrappers.wrapper_toolchain>:582] File saved: /root/axera/axera-quan-hjj/joint/efficientnetb0-handpose.joint
  121. [08 06:25:38 <frozen super_pulsar.toolchain_wrappers.wrapper_toolchain>:587] DBG cleared /root/tmpo0o7ztk9
  122. | Pre_alloc OCM | linebuffer(may SWAP later) | ringbuffer | parameter |
  123. | size(ratio of whole OCM) | 0(0.0)% | 0(0.0)% | 116736(5.6)% |
  124. | range | (None, None) | (None, None) | (1980416, 2097152) |
  125. | Pre_alloc DDR | ringbuffer | feature_swap |
  126. | size(M) | 0.00 | 0.00 |
  127. | profile conv | work_cyc | linebuf | warmup_tail | core_idle | io_idle | stride2_idle | standalone_fetch | MAC |
  128. | ratio in conv | 2772180 (100.0%) | 32131 (1.2%) | 240800 (8.7%) | 0 (0.0%) | 710439 (25.6%) | 831608 (30.0%) | 210 (0.0%) | 545325 (19.7%) |
  129. | profile ideal DDR_IO | min_io_sum | min_params_read | min_inputs_read | min_outputs_write |
  130. | DDR IO size (Byte) | 14116640 (100.0%) | 14039808 (99.5%) | 76800 (0.5%) | 32 (0.0%) |
  131. | profile extra DDR_IO | extra_ddr_io | extra_params_read | extra_inputs_read | extra_outputs_wrt | extra_swap_read | extra_swap_wrt | ddr_rb_read | ddr_rb_wrt |
  132. | DDR IO size (Byte) | 7109136 (100.0%) | 0 (0.0%) | 1157384 (16.3%) | 1106184 (15.6%) | 0 (0.0%) | 0 (0.0%) | 2422784 (34.1%) | 2422784 (34.1%) |
  133. | profile stream EU | work_cyc | ld_ratio | ld_param_ratio | mv_ratio | mv_linebuffer_ratio | st_ratio |
  134. | teng | 18636864 | 1844691(9.9%) | 7460162(40.0%) | 7509589(40.3%) | 40684(0.2%) | 1781737(9.6%) |
  135. | breakdown of mv_ratio in profile stream EU | teng | all_eus with mv-cmds |
  136. | total_cyc_num | 7509589(100.0%) | 7509589(100.0%) |
  137. | weight0_mode,convNxM,mode23,nopad | 2404764(32.0%) | 2404764(32.0%) |
  138. | teng_binary_mul | 1056050(14.1%) | 1056050(14.1%) |
  139. | mv,affine,unpack_lsb | 565295(7.5%) | 565295(7.5%) |
  140. | mv,broadcast | 483148(6.4%) | 483148(6.4%) |
  141. | teng_binary_sub | 451312(6.0%) | 451312(6.0%) |
  142. | mv,resize | 439020(5.8%) | 439020(5.8%) |
  143. | mv,padding | 389020(5.2%) | 389020(5.2%) |
  144. | mv,subtensor | 365393(4.9%) | 365393(4.9%) |
  145. | mv,concat_c | 328182(4.4%) | 328182(4.4%) |
  146. | teng_binary_add | 308771(4.1%) | 308771(4.1%) |
  147. | mv,pack_lsb | 267240(3.6%) | 267240(3.6%) |
  148. | weight0_mode,convNxM,mode20,nopad | 251040(3.3%) | 251040(3.3%) |
  149. | conv_transform | 85377(1.1%) | 85377(1.1%) |
  150. | mv,padding_ch | 63717(0.8%) | 63717(0.8%) |
  151. | const | 44401(0.6%) | 44401(0.6%) |
  152. | sigmoid | 6129(0.1%) | 6129(0.1%) |
  153. | weight0_mode,mode26,fc | 501(0.0%) | 501(0.0%) |
  154. | revert_split | 229(0.0%) | 229(0.0%) |
  155. | EU | work_cyc | tot_cyc | ratio | fps | fps_bnd |
  156. | conv-1core | 2772181 | 18888839 | 14.0% | 42.350 | 288.580 |
  157. | teng | 18636864 | 18888839 | 98.0% | 42.350 | 42.930 |
  158. | breakdown of cmds_num for each op | cmds_num | percentage |
  159. | weight0_mode,convNxM,mode23,nopad | 4590 | 35.20% |
  160. | mv,resize | 1306 | 10.01% |
  161. | mv,padding | 929 | 7.12% |
  162. | mv,affine,unpack_lsb | 913 | 7.00% |
  163. | mv,concat_c | 863 | 6.62% |
  164. | mv,subtensor | 748 | 5.74% |
  165. | weight0_mode,mode23,nopad | 699 | 5.36% |
  166. | weight0_mode,convNxM,mode26,nopad | 585 | 4.49% |
  167. | weight0_mode,mode23 | 543 | 4.16% |
  168. | weight0_mode,convNxM,mode20,nopad | 492 | 3.77% |
  169. | weight0_mode,mode20,nopad | 359 | 2.75% |
  170. | const | 199 | 1.53% |
  171. | conv_transform | 152 | 1.17% |
  172. | revert_split | 124 | 0.95% |
  173. | teng_binary_mul | 118 | 0.90% |
  174. | mv,padding_ch | 87 | 0.67% |
  175. | teng_binary_sub | 77 | 0.59% |
  176. | mv,broadcast | 76 | 0.58% |
  177. | teng_binary_add | 66 | 0.51% |
  178. | mv,pack_lsb | 53 | 0.41% |
  179. | sigmoid | 35 | 0.27% |
  180. | weight0_mode,mode20 | 13 | 0.10% |
  181. | weight0_mode,mode26,fc | 8 | 0.06% |
  182. | bgr_pack | 6 | 0.05% |
  183. | total_num | 13041 | 100% |
  184. 06:25:37 [I]final_check: FINAL_CHECK: output op_8341 check successed!