
      [Pytorch] Transformer Engine error: RuntimeError: The specified pointer resides on host memory and is not registered with any CUDA device.


      Problem description

      One day, while running MoE-related code with Megatron-LM, I hit this error:

      Exception: The specified pointer resides on host memory and is not registered with any CUDA device.
      Traceback (most recent call last):
        File "/workspace/userdata/moelb/Megatron-LM/tests/unit_tests/transformer/moe/xxx.py", line 309, in perf_test
          output, _ = self.model(hidden_states)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
          return self._call_impl(*args, **kwargs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1762, in _call_impl
          return forward_call(*args, **kwargs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/workspace/userdata/moelb/Megatron-LM/megatron/core/distributed/data_parallel_base.py", line 22, in forward
          return self.module(*inputs, **kwargs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
          return self._call_impl(*args, **kwargs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1857, in _call_impl
          return inner()
                 ^^^^^^^
        File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1805, in inner
          result = forward_call(*args, **kwargs)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/workspace/userdata/moelb/Megatron-LM/megatron/core/transformer/moe/moe_layer.py", line 251, in forward
          output, mlp_bias = custom_forward(hidden_states)
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/workspace/userdata/moelb/Megatron-LM/megatron/core/transformer/moe/moe_layer.py", line 217, in custom_forward
          expert_output, mlp_bias = self.experts(
                                    ^^^^^^^^^^^^^
        File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
          return self._call_impl(*args, **kwargs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1857, in _call_impl
          return inner()
                 ^^^^^^^
        File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1805, in inner
          result = forward_call(*args, **kwargs)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/workspace/userdata/moelb/Megatron-LM/megatron/core/transformer/moe/experts.py", line 780, in forward
          intermediate_parallel, bias_parallel = self.linear_fc1(
                                                 ^^^^^^^^^^^^^^^^
        File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
          return self._call_impl(*args, **kwargs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1857, in _call_impl
          return inner()
                 ^^^^^^^
        File "/usr/local/lib/python3.12/dist-packages/torch/nn/modules/module.py", line 1805, in inner
          result = forward_call(*args, **kwargs)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/workspace/userdata/moelb/Megatron-LM/megatron/core/extensions/transformer_engine.py", line 1059, in forward
          out = super().forward(x, m_splits, is_first_microbatch=_is_first_microbatch)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/usr/local/lib/python3.12/dist-packages/torch/_dynamo/eval_frame.py", line 749, in _fn
          return fn(*args, **kwargs)
                 ^^^^^^^^^^^^^^^^^^^
        File "/usr/local/lib/python3.12/dist-packages/transformer_engine/pytorch/module/grouped_linear.py", line 654, in forward
          out = linear_fn(*args)
                ^^^^^^^^^^^^^^^^
        File "/usr/local/lib/python3.12/dist-packages/torch/autograd/function.py", line 578, in apply
          return super().apply(*args, **kwargs)  # type: ignore[misc]
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/usr/local/lib/python3.12/dist-packages/transformer_engine/pytorch/module/grouped_linear.py", line 158, in forward
          _ = general_grouped_gemm(
              ^^^^^^^^^^^^^^^^^^^^^
        File "/usr/local/lib/python3.12/dist-packages/transformer_engine/pytorch/cpp_extensions/gemm.py", line 206, in general_grouped_gemm
          bias = tex.te_general_grouped_gemm(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      RuntimeError: The specified pointer resides on host memory and is not registered with any CUDA device.
      

      Analysis

      • Taken literally, it sounds like some tensor involved in the GroupedMLP computation lives on host (CPU) memory rather than on the GPU. But inside experts.py, right before the linear_fc1 call, I printed the device of the input tensor and of the parameter tensors: they were all cuda, and their contents looked perfectly normal (see the sketch after this list).

      • Maybe a missing synchronization? I added torch.cuda.current_stream().synchronize() before the call, but the error persisted. I printed the tensor contents again, and they still looked fine.

      • Even stranger, the error only appears once sequence_length*batch_size reaches 16384; at 8192 it never shows up. I also toggled several other options (such as moe_permute_fusion) to see whether any of them was responsible, but the error occurred in every configuration.

      • At that point I was out of ideas and began to suspect a bug inside transformer-engine itself.
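
      For reference, the device check described above was essentially the following. This is a minimal sketch, not the actual Megatron-LM code; the debug_inputs helper and the commented call site are illustrative only.

      import torch

      def debug_inputs(name, *tensors):
          """Print device/dtype/shape for each tensor, then wait for the current CUDA stream."""
          for i, t in enumerate(tensors):
              if t is not None:
                  print(f"{name}[{i}]: device={t.device}, dtype={t.dtype}, shape={tuple(t.shape)}")
          # Synchronize to rule out "a kernel is still producing this data" as an explanation.
          torch.cuda.current_stream().synchronize()

      # Hypothetical call site, right before the linear_fc1 call in experts.py:
      # debug_inputs("fc1_input", permuted_hidden_states, self.linear_fc1.weight)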

      Resolution

      When the problem occurred, I was using NVIDIA's official PyTorch container, version 25.03-py3, which ships transformer-engine 2.1.0. I upgraded the container to 25.05-py3, which ships transformer-engine 2.3.0, and the error disappeared. So this was indeed, in all likelihood, a bug inside transformer-engine.
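
      To confirm which versions a given container image ships, a quick check along these lines should work (assuming the installed transformer_engine package exposes __version__, which recent releases do):

      import torch
      import transformer_engine

      # In my case: transformer-engine 2.1.0 in 25.03-py3 (error), 2.3.0 in 25.05-py3 (no error).
      print("torch:", torch.__version__)
      print("transformer_engine:", transformer_engine.__version__)  # if missing, fall back to `pip show transformer_engine`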

      posted @ 2025-06-24 14:50  CQzhangyu