
georgia-tech-db / eva / 9151e506-df33-4c2b-a79d-29b723492654
0%

Build:
DEFAULT BRANCH: master
Ran 26 Jun 2023 03:42PM UTC
Jobs 2
Files 260
Run time 12s
pending completion
9151e506-df33-4c2b-a79d-29b723492654

push

circle-ci

web-flow
fix: GPU ids and degree of parallelism (#817)

This PR addresses two issues.

1. We previously only allowed parallelism when more than two GPUs were
available. I think this is too conservative. First, some functional
expressions run only on CPUs, and parallelizing CPU functional
expressions can still improve performance. Additionally, even with only
one GPU available, enabling parallelization still helps because it
improves GPU utilization.
Currently, I just hardcode the DOP (degree of parallelism), but I am
thinking of letting users configure it themselves through `eva.yml`.
Any feedback on this?
@xzdandy @gaurav274 @jarulraj

2. I found a bug in the previous approach that occurs when more than one
GPU is available. Here is a concrete example. Let's assume we have two
GPUs. The GPU id that we get from the `Context` for the second GPU is
`1`, because the context runs in the main process, which detects two
GPUs on the PyTorch side (e.g., `GPU ID = [0, 1]`). However, within the
Ray process, after we set the environment variable
`CUDA_VISIBLE_DEVICES=1`, the PyTorch device indices are reset. Because
the Ray process can access only one GPU, it sees only `GPU ID = [0]`.
Moving a model to the GPU with `to('cuda:1')` therefore fails with a
no-GPU error. The fix is simple: we expose all GPUs to every Ray
process, so we no longer need to worry about PyTorch resetting device
ids. To expose all GPUs, we find the max GPU id from the context and
list every GPU id up to that max in the environment variable.
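
A minimal sketch of the fix described in item 2, not the code from the PR itself. The function names `visible_devices_value` and `expose_all_gpus` are hypothetical, and the sketch assumes that the max GPU id is itself included in the exposed list and that the variable is set before CUDA is initialized in the worker process.

```python
import os


def visible_devices_value(context_gpu_ids):
    """Build a CUDA_VISIBLE_DEVICES string covering ids 0..max(context_gpu_ids).

    Hypothetical helper: `context_gpu_ids` stands in for the GPU ids that
    the main-process context detected (e.g. [0, 1]).
    """
    if not context_gpu_ids:
        return ""  # an empty value hides all GPUs from the process
    max_id = max(context_gpu_ids)
    # Assumption: the max id is included, so every detected GPU stays visible.
    return ",".join(str(i) for i in range(max_id + 1))


def expose_all_gpus(context_gpu_ids, environ=os.environ):
    """Set CUDA_VISIBLE_DEVICES so worker-side ids match main-process ids."""
    environ["CUDA_VISIBLE_DEVICES"] = visible_devices_value(context_gpu_ids)
    return environ["CUDA_VISIBLE_DEVICES"]
```

With two GPUs, every worker would see `CUDA_VISIBLE_DEVICES=0,1`, so an id of `1` obtained in the main process still refers to the same device inside the worker.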

---------

Co-authored-by: xzdandy <xzdandy@gmail.com>

10073 of 10629 relevant lines covered (94.77%)

1.9 hits per line

Jobs
ID  Job ID                                  Ran                      Files  Coverage
1   9151e506-df33-4c2b-a79d-29b723492654.1  26 Jun 2023 03:42PM UTC  0      94.77
2   9151e506-df33-4c2b-a79d-29b723492654.2  26 Jun 2023 04:13PM UTC  0      95.0
Source Files on build 9151e506-df33-4c2b-a79d-29b723492654
Detailed source file information is not available for this build.
  • 19c7f09a on github