djeedai / bevy_hanabi, build 17710478608 (push, github, web-flow)

14 Sep 2025 11:21AM UTC coverage: 66.279% (+0.2%) from 66.033%

Move indirect draw args to separate buffer (#495)

Move the indirect draw args outside of `EffectMetadata` and into a
separate buffer of their own. This decouples the indirect draw args,
which are largely GPU-driven, from the effect metadata, which is
largely (and ideally entirely) CPU-driven. The new indirect draw args
buffer stores both indexed and non-indexed draw args, the latter padded
with an extra `u32`. This keeps all entries the same size, which
simplifies handling and, more importantly, allows retaining a single
unified dispatch of `vfx_indirect` for all effects without adding any
extra indirection or splitting into two passes.
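
For illustration, here is a minimal sketch of what such a unified entry
layout can look like on the CPU side. The type and field names below are
assumptions for this sketch (loosely following wgpu's indirect argument
layouts), not the actual `bevy_hanabi` definitions:

    /// Indexed indirect draw args (5 words).
    #[repr(C)]
    struct GpuDrawIndexedIndirectArgs {
        index_count: u32,
        instance_count: u32,
        first_index: u32,
        base_vertex: i32,
        first_instance: u32,
    }

    /// Non-indexed indirect draw args (4 words), padded with one extra
    /// `u32` so both kinds of entry share the same 20-byte stride.
    #[repr(C)]
    struct GpuDrawIndirectArgsPadded {
        vertex_count: u32,
        instance_count: u32,
        first_vertex: u32,
        first_instance: u32,
        _pad: u32,
    }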

The main benefit is that this prevents resetting the effect when Bevy
relocates the mesh. Such a relocation requires re-uploading the mesh
location info into the draw args (notably the base vertex and/or first
index), but otherwise doesn't affect runtime info like the number of
alive particles. Previously, when this happened, the entire
`EffectMetadata` was re-uploaded from the CPU with default values for
the GPU-driven fields, effectively leading to a "reset" of the effect
(alive particle count reset to zero), as the warning in #471 used to
highlight.

This change also cleans up the shaders by removing the `dead_count`
atomic counter and instead adding the constant `capacity` particle
count, which allows deducing the dead particle count from the existing
`alive_count`. This makes `alive_count` the single source of truth for
the number of alive particles, makes several shaders much more
readable, and saves a couple of atomic instructions.
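
As a sketch of the resulting invariant (not the actual WGSL shader
code), the dead count simply becomes derived data:

    // With a constant `capacity`, `alive_count` alone determines the
    // dead count, so no separate `dead_count` atomic is needed.
    fn dead_count(capacity: u32, alive_count: u32) -> u32 {
        capacity - alive_count
    }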

96 of 122 new or added lines in 5 files covered. (78.69%)

4 existing lines in 1 file now uncovered.

4900 of 7393 relevant lines covered (66.28%)

449.04 hits per line

Source File: /src/render/aligned_buffer_vec.rs (55.09% covered)
1
use std::{num::NonZeroU64, ops::Range};
2

3
use bevy::{
4
    log::trace,
5
    render::{
6
        render_resource::{
7
            BindingResource, Buffer, BufferAddress, BufferBinding, BufferDescriptor, BufferUsages,
8
            ShaderSize, ShaderType,
9
        },
10
        renderer::{RenderDevice, RenderQueue},
11
    },
12
};
13
use bytemuck::{cast_slice, Pod};
14
use copyless::VecHelper;
15

16
/// Like Bevy's [`BufferVec`], but with extra per-item alignment.
17
///
18
/// This helper ensures the individual array elements are properly aligned,
19
/// depending on the device constraints and the WGSL rules. In general using
20
/// [`BufferVec`] is enough to ensure alignment; however when some array items
21
/// also need to be bound individually, then each item (not only the array
22
/// itself) needs to be aligned to the device requirements. This is admittedly a
23
/// very specific case, because the device alignment might be very large (256
24
/// bytes) and this causes a lot of wasted space (padding per-element, instead
25
/// of padding for the entire array).
26
///
27
/// For this buffer to work correctly and items be bindable individually, the
28
/// alignment must come from one of the [`WgpuLimits`]. For example for a
29
/// storage buffer, to be able to bind the entire buffer but also any subset of
30
/// it (including individual elements), the extra alignment must
31
/// be [`WgpuLimits::min_storage_buffer_offset_alignment`].
32
///
33
/// The element type `T` needs to implement the following traits:
34
/// - [`Pod`] to allow copy.
35
/// - [`ShaderType`] because it needs to be mapped for a shader.
36
/// - [`ShaderSize`] to ensure a fixed footprint, to allow packing multiple
37
///   instances inside a single buffer. This therefore excludes any
38
///   runtime-sized array.
39
///
40
/// [`BufferVec`]: bevy::render::render_resource::BufferVec
41
/// [`WgpuLimits`]: bevy::render::settings::WgpuLimits
42
pub struct AlignedBufferVec<T: Pod + ShaderSize> {
43
    /// Pending values accumulated on CPU and not yet written to GPU.
44
    values: Vec<T>,
45
    /// GPU buffer if already allocated, or `None` otherwise.
46
    buffer: Option<Buffer>,
47
    /// Capacity of the buffer, in number of elements.
48
    capacity: usize,
49
    /// Size of a single buffer element, in bytes, in CPU memory (Rust layout).
50
    item_size: usize,
51
    /// Size of a single buffer element, in bytes, aligned to GPU memory
52
    /// constraints.
53
    aligned_size: usize,
54
    /// GPU buffer usages.
55
    buffer_usage: BufferUsages,
56
    /// Optional GPU buffer name, for debugging.
57
    label: Option<String>,
58
}
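
// Illustrative usage sketch; `GpuItem`, `render_device`, and `render_queue` are
// hypothetical placeholders, not items from this file. The call pattern mirrors
// the `gpu_tests` module at the bottom of this file:
//
//     let item_align = render_device.limits().min_storage_buffer_offset_alignment as u64;
//     let mut abv = AlignedBufferVec::<GpuItem>::new(
//         BufferUsages::STORAGE,
//         NonZeroU64::new(item_align),
//         Some("my_items".to_string()),
//     );
//     let index = abv.push(GpuItem::default());
//     abv.write_buffer(&render_device, &render_queue);
//     // Each element can now be bound individually via a dynamic byte offset:
//     let byte_offset = abv.dynamic_offset(index);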
59

60
impl<T: Pod + ShaderSize> Default for AlignedBufferVec<T> {
61
    fn default() -> Self {
29✔
62
        let item_size = std::mem::size_of::<T>();
58✔
63
        let aligned_size = <T as ShaderSize>::SHADER_SIZE.get() as usize;
58✔
64
        assert!(aligned_size >= item_size);
58✔
65
        Self {
66
            values: Vec::new(),
58✔
67
            buffer: None,
68
            capacity: 0,
69
            buffer_usage: BufferUsages::all(),
58✔
70
            item_size,
71
            aligned_size,
72
            label: None,
73
        }
74
    }
75
}
76

77
impl<T: Pod + ShaderSize> AlignedBufferVec<T> {
78
    /// Create a new collection.
79
    ///
80
    /// `item_align` is an optional additional alignment for items in the
81
    /// collection. If greater than the natural alignment dictated by WGSL
82
    /// rules, this extra alignment is enforced. Otherwise it's ignored (so you
83
    /// can pass `0` to ignore).
84
    ///
85
    /// # Panics
86
    ///
87
    /// Panics if `buffer_usage` contains [`BufferUsages::UNIFORM`] and the
88
    /// layout of the element type `T` does not meet the requirements of the
89
    /// uniform address space, as tested by
90
    /// [`ShaderType::assert_uniform_compat()`].
91
    ///
92
    /// [`BufferUsages::UNIFORM`]: bevy::render::render_resource::BufferUsages::UNIFORM
93
    pub fn new(
29✔
94
        buffer_usage: BufferUsages,
95
        item_align: Option<NonZeroU64>,
96
        label: Option<String>,
97
    ) -> Self {
98
        // GPU-aligned item size, compatible with WGSL rules
99
        let item_size = <T as ShaderSize>::SHADER_SIZE.get() as usize;
58✔
100
        // Extra manual alignment for device constraints
101
        let aligned_size = if let Some(item_align) = item_align {
84✔
102
            let item_align = item_align.get() as usize;
×
103
            let aligned_size = item_size.next_multiple_of(item_align);
×
104
            assert!(aligned_size >= item_size);
×
105
            assert!(aligned_size % item_align == 0);
52✔
106
            aligned_size
26✔
107
        } else {
108
            item_size
3✔
109
        };
110
        trace!(
×
111
            "AlignedBufferVec['{}']: item_size={} aligned_size={}",
4✔
112
            label.as_ref().map(|s| &s[..]).unwrap_or(""),
24✔
113
            item_size,
×
114
            aligned_size
×
115
        );
116
        if buffer_usage.contains(BufferUsages::UNIFORM) {
4✔
117
            <T as ShaderType>::assert_uniform_compat();
4✔
118
        }
119
        Self {
120
            buffer_usage,
121
            aligned_size,
122
            label,
123
            ..Default::default()
124
        }
125
    }
126

127
    fn safe_label(&self) -> &str {
2,032✔
128
        self.label.as_ref().map(|s| &s[..]).unwrap_or("")
12,192✔
129
    }
130

131
    #[inline]
132
    pub fn buffer(&self) -> Option<&Buffer> {
4,073✔
133
        self.buffer.as_ref()
8,146✔
134
    }
135

136
    /// Get a binding for the entire buffer.
137
    #[inline]
138
    #[allow(dead_code)]
139
    pub fn binding(&self) -> Option<BindingResource<'_>> {
×
140
        // FIXME - Return a Buffer wrapper first, which can be unwrapped, then from that
141
        // wrapper implement all the xxx_binding() helpers. That avoids a bunch of "if
142
        // let Some()" everywhere when we know the buffer is valid. The only reason the
143
        // buffer might not be valid is if it was not created, and in that case
144
        // we wouldn't be calling the xxx_bindings() helpers, we'd have earlied out
145
        // before.
146
        let buffer = self.buffer()?;
×
147
        Some(BindingResource::Buffer(BufferBinding {
×
148
            buffer,
×
149
            offset: 0,
×
150
            size: None, // entire buffer
×
151
        }))
152
    }
153

154
    /// Get a binding for a subset of the elements of the buffer.
155
    ///
156
    /// Returns a binding for the elements in the range `offset..offset+count`.
157
    ///
158
    /// # Panics
159
    ///
160
    /// Panics if `count` is zero.
161
    #[inline]
162
    #[allow(dead_code)]
163
    pub fn range_binding(&self, offset: u32, count: u32) -> Option<BindingResource<'_>> {
×
164
        assert!(count > 0);
×
165
        let buffer = self.buffer()?;
×
166
        let offset = self.aligned_size as u64 * offset as u64;
×
167
        let size = NonZeroU64::new(self.aligned_size as u64 * count as u64).unwrap();
×
168
        Some(BindingResource::Buffer(BufferBinding {
×
169
            buffer,
×
170
            offset,
×
171
            size: Some(size),
×
172
        }))
173
    }
174

175
    #[inline]
176
    #[allow(dead_code)]
177
    pub fn capacity(&self) -> usize {
×
178
        self.capacity
×
179
    }
180

181
    #[inline]
182
    pub fn len(&self) -> usize {
4,115✔
183
        self.values.len()
8,230✔
184
    }
185

186
    /// Size in bytes of a single item in the buffer, aligned to the item
187
    /// alignment.
188
    #[inline]
189
    pub fn aligned_size(&self) -> usize {
4,063✔
190
        self.aligned_size
4,063✔
191
    }
192

193
    /// Calculate a dynamic byte offset for a bind group from an array element
194
    /// index.
195
    ///
196
    /// This returns the product of `index` and the internal [`aligned_size()`].
197
    ///
198
    /// # Panic
199
    ///
200
    /// Panics if the `index` is too large, producing a byte offset larger than
201
    /// `u32::MAX`.
202
    ///
203
    /// [`aligned_size()`]: crate::AlignedBufferVec::aligned_size
204
    #[inline]
205
    pub fn dynamic_offset(&self, index: usize) -> u32 {
×
206
        let offset = self.aligned_size * index;
×
207
        assert!(offset <= u32::MAX as usize);
×
208
        u32::try_from(offset).expect("AlignedBufferVec index out of bounds")
×
209
    }
210

211
    #[inline]
212
    #[allow(dead_code)]
213
    pub fn is_empty(&self) -> bool {
3,106✔
214
        self.values.is_empty()
6,212✔
215
    }
216

217
    /// Append a value to the buffer.
218
    ///
219
    /// The content is stored on the CPU and uploaded on the GPU once
220
    /// [`write_buffer()`] is called.
221
    ///
222
    /// [`write_buffer()`]: crate::AlignedBufferVec::write_buffer
223
    pub fn push(&mut self, value: T) -> usize {
1,042✔
224
        let index = self.values.len();
3,126✔
225
        self.values.alloc().init(value);
4,168✔
226
        index
1,042✔
227
    }
228

229
    /// Reserve some capacity into the buffer.
230
    ///
231
    /// If the buffer is reallocated, the old content (on the GPU) is lost, and
232
    /// needs to be re-uploaded to the newly-created buffer. This is done with
233
    /// [`write_buffer()`].
234
    ///
235
    /// # Returns
236
    ///
237
    /// `true` if the buffer was (re)allocated, or `false` if an existing buffer
238
    /// was reused which already had enough capacity.
239
    ///
240
    /// [`write_buffer()`]: crate::AlignedBufferVec::write_buffer
241
    pub fn reserve(&mut self, capacity: usize, device: &RenderDevice) -> bool {
1,019✔
242
        if capacity > self.capacity {
1,019✔
243
            let size = self.aligned_size * capacity;
10✔
244
            trace!(
5✔
245
                "reserve['{}']: increase capacity from {} to {} elements, new size {} bytes",
2✔
246
                self.safe_label(),
4✔
247
                self.capacity,
×
248
                capacity,
×
249
                size
×
250
            );
251
            self.capacity = capacity;
5✔
252
            if let Some(old_buffer) = self.buffer.take() {
6✔
253
                trace!(
×
254
                    "reserve['{}']: destroying old buffer #{:?}",
×
255
                    self.safe_label(),
×
256
                    old_buffer.id()
×
257
                );
258
                old_buffer.destroy();
×
259
            }
260
            let new_buffer = device.create_buffer(&BufferDescriptor {
15✔
261
                label: self.label.as_ref().map(|s| &s[..]),
19✔
262
                size: size as BufferAddress,
5✔
263
                usage: BufferUsages::COPY_DST | self.buffer_usage,
5✔
264
                mapped_at_creation: false,
×
265
            });
266
            trace!(
5✔
267
                "reserve['{}']: created new buffer #{:?}",
2✔
268
                self.safe_label(),
4✔
269
                new_buffer.id(),
4✔
270
            );
271
            self.buffer = Some(new_buffer);
10✔
272
            // FIXME - this discards the old content if any!!!
273
            true
5✔
274
        } else {
275
            false
1,014✔
276
        }
277
    }
278

279
    /// Schedule the buffer write to GPU.
280
    ///
281
    /// # Returns
282
    ///
283
    /// `true` if the buffer was (re)allocated, `false` otherwise.
284
    pub fn write_buffer(&mut self, device: &RenderDevice, queue: &RenderQueue) -> bool {
2,064✔
285
        if self.values.is_empty() {
4,128✔
286
            return false;
1,046✔
287
        }
288
        trace!(
×
289
            "write_buffer['{}']: values.len={} item_size={} aligned_size={}",
1,014✔
290
            self.safe_label(),
2,028✔
291
            self.values.len(),
2,028✔
292
            self.item_size,
×
293
            self.aligned_size
×
294
        );
295
        let buffer_changed = self.reserve(self.values.len(), device);
×
296
        if let Some(buffer) = &self.buffer {
1,018✔
297
            let aligned_size = self.aligned_size * self.values.len();
×
298
            trace!(
×
299
                "aligned_buffer['{}']: size={} buffer={:?}",
1,014✔
300
                self.safe_label(),
2,028✔
301
                aligned_size,
×
302
                buffer.id(),
2,028✔
303
            );
304
            let mut aligned_buffer: Vec<u8> = vec![0; aligned_size];
×
305
            for i in 0..self.values.len() {
1,021✔
306
                let src: &[u8] = cast_slice(std::slice::from_ref(&self.values[i]));
×
307
                let dst_offset = i * self.aligned_size;
×
308
                let dst_range = dst_offset..dst_offset + self.item_size;
×
309
                trace!("+ copy: src={:?} dst={:?}", src.as_ptr(), dst_range);
3,042✔
310
                let dst = &mut aligned_buffer[dst_range];
×
311
                dst.copy_from_slice(src);
×
312
            }
313
            let bytes: &[u8] = cast_slice(&aligned_buffer);
×
314
            queue.write_buffer(buffer, 0, bytes);
×
315
        }
316
        buffer_changed
×
317
    }
318

319
    pub fn clear(&mut self) {
2,063✔
320
        self.values.clear();
4,126✔
321
    }
322
}
323

324
impl<T: Pod + ShaderSize> std::ops::Index<usize> for AlignedBufferVec<T> {
325
    type Output = T;
326

327
    fn index(&self, index: usize) -> &Self::Output {
×
328
        &self.values[index]
×
329
    }
330
}
331

332
impl<T: Pod + ShaderSize> std::ops::IndexMut<usize> for AlignedBufferVec<T> {
333
    fn index_mut(&mut self, index: usize) -> &mut Self::Output {
×
334
        &mut self.values[index]
×
335
    }
336
}
337

338
#[derive(Debug, Clone, PartialEq, Eq)]
339
struct FreeRow(pub Range<u32>);
340

341
impl PartialOrd for FreeRow {
342
    fn partial_cmp(&self, other: &Self) -> Option<std::cmp::Ordering> {
×
343
        Some(self.cmp(other))
×
344
    }
345
}
346

347
impl Ord for FreeRow {
348
    fn cmp(&self, other: &Self) -> std::cmp::Ordering {
6✔
349
        self.0.start.cmp(&other.0.start)
18✔
350
    }
351
}
352

353
/// Like [`AlignedBufferVec`], but for heterogeneous data.
354
#[derive(Debug)]
355
pub struct HybridAlignedBufferVec {
356
    /// Pending values accumulated on CPU and not yet written to GPU.
357
    values: Vec<u8>,
358
    /// GPU buffer if already allocated, or `None` otherwise.
359
    buffer: Option<Buffer>,
360
    /// Capacity of the buffer, in bytes.
361
    capacity: usize,
362
    /// Alignment of each element, in bytes.
363
    item_align: usize,
364
    /// GPU buffer usages.
365
    buffer_usage: BufferUsages,
366
    /// Optional GPU buffer name, for debugging.
367
    label: Option<String>,
368
    /// Free ranges available for re-allocation. Those are row ranges; byte
369
    /// ranges are obtained by multiplying these by `item_align`.
370
    free_rows: Vec<FreeRow>,
371
    /// Is the GPU buffer stale, requiring the CPU content to be re-uploaded?
372
    is_stale: bool,
373
}
374

375
impl HybridAlignedBufferVec {
376
    /// Create a new collection.
377
    ///
378
    /// `item_align` is the alignment for items in the collection.
379
    pub fn new(buffer_usage: BufferUsages, item_align: NonZeroU64, label: Option<String>) -> Self {
5✔
380
        let item_align = item_align.get() as usize;
10✔
381
        trace!(
5✔
382
            "HybridAlignedBufferVec['{}']: item_align={} byte",
3✔
383
            label.as_ref().map(|s| &s[..]).unwrap_or(""),
18✔
384
            item_align,
385
        );
386
        Self {
387
            values: vec![],
10✔
388
            buffer: None,
389
            capacity: 0,
390
            item_align,
391
            buffer_usage,
392
            label,
393
            free_rows: vec![],
5✔
394
            is_stale: true,
395
        }
396
    }
397

398
    #[inline]
399
    pub fn buffer(&self) -> Option<&Buffer> {
2,038✔
400
        self.buffer.as_ref()
4,076✔
401
    }
402

403
    /// Get a binding for the entire buffer.
404
    #[allow(dead_code)]
405
    #[inline]
406
    pub fn max_binding(&self) -> Option<BindingResource<'_>> {
×
407
        // FIXME - Return a Buffer wrapper first, which can be unwrapped, then from that
408
        // wrapper implement all the xxx_binding() helpers. That avoids a bunch of "if
409
        // let Some()" everywhere when we know the buffer is valid. The only reason the
410
        // buffer might not be valid is if it was not created, and in that case
411
        // we wouldn't be calling the xxx_bindings() helpers, we'd have earlied out
412
        // before.
413
        let buffer = self.buffer()?;
×
414
        Some(BindingResource::Buffer(BufferBinding {
×
415
            buffer,
×
416
            offset: 0,
×
417
            size: None, // entire buffer
418
        }))
419
    }
420

421
    /// Get a binding for the first `size` bytes of the buffer.
422
    ///
423
    /// # Panics
424
    ///
425
    /// Panics if `size` is zero.
426
    #[allow(dead_code)]
427
    #[inline]
428
    pub fn lead_binding(&self, size: u32) -> Option<BindingResource<'_>> {
×
429
        let buffer = self.buffer()?;
×
430
        let size = NonZeroU64::new(size as u64).unwrap();
×
431
        Some(BindingResource::Buffer(BufferBinding {
×
432
            buffer,
×
433
            offset: 0,
×
434
            size: Some(size),
×
435
        }))
436
    }
437

438
    /// Get a binding for a subset of the elements of the buffer.
439
    ///
440
    /// Returns a binding for the elements in the range `offset..offset+count`.
441
    ///
442
    /// # Panics
443
    ///
444
    /// Panics if `offset` is not a multiple of the alignment specified on
445
    /// construction.
446
    ///
447
    /// Panics if `size` is zero.
448
    #[allow(dead_code)]
449
    #[inline]
450
    pub fn range_binding(&self, offset: u32, size: u32) -> Option<BindingResource<'_>> {
×
451
        assert!(offset as usize % self.item_align == 0);
×
452
        let buffer = self.buffer()?;
×
453
        let size = NonZeroU64::new(size as u64).unwrap();
×
454
        Some(BindingResource::Buffer(BufferBinding {
×
455
            buffer,
×
456
            offset: offset as u64,
×
457
            size: Some(size),
×
458
        }))
459
    }
460

461
    /// Capacity of the allocated GPU buffer, in bytes.
462
    ///
463
    /// This may be zero if the buffer was not allocated yet. In general, this
464
    /// can differ from the actual data size cached on CPU and waiting to be
465
    /// uploaded to GPU.
466
    #[inline]
467
    #[allow(dead_code)]
468
    pub fn capacity(&self) -> usize {
×
469
        self.capacity
×
470
    }
471

472
    /// Current buffer size, in bytes.
473
    ///
474
    /// This represents the size of the CPU data uploaded to GPU. Pending a GPU
475
    /// buffer re-allocation or re-upload, this size might differ from the
476
    /// actual GPU buffer size. But they're eventually consistent.
477
    #[inline]
478
    pub fn len(&self) -> usize {
1✔
479
        self.values.len()
2✔
480
    }
481

482
    /// Alignment, in bytes, of all the elements.
483
    #[allow(dead_code)]
484
    #[inline]
485
    pub fn item_align(&self) -> usize {
×
486
        self.item_align
×
487
    }
488

489
    /// Calculate a dynamic byte offset for a bind group from an array element
490
    /// index.
491
    ///
492
    /// This returns the product of `index` and the internal [`item_align()`].
493
    ///
494
    /// # Panic
495
    ///
496
    /// Panics if the `index` is too large, producing a byte offset larger than
497
    /// `u32::MAX`.
498
    ///
499
    /// [`item_align()`]: crate::HybridAlignedBufferVec::item_align
500
    #[allow(dead_code)]
501
    #[inline]
502
    pub fn dynamic_offset(&self, index: usize) -> u32 {
×
503
        let offset = self.item_align * index;
×
504
        assert!(offset <= u32::MAX as usize);
×
505
        u32::try_from(offset).expect("HybridAlignedBufferVec index out of bounds")
×
506
    }
507

508
    #[inline]
509
    #[allow(dead_code)]
510
    pub fn is_empty(&self) -> bool {
2,063✔
511
        self.values.is_empty()
4,126✔
512
    }
513

514
    /// Append a value to the buffer.
515
    ///
516
    /// As with [`set_content()`], the content is stored on the CPU and uploaded
517
    /// on the GPU once [`write_buffers()`] is called.
518
    ///
519
    /// # Returns
520
    ///
521
    /// Returns a range starting at the byte offset at which the new element was
522
    /// inserted, which is guaranteed to be a multiple of [`item_align()`].
523
    /// The range span is the item byte size.
524
    ///
525
    /// [`item_align()`]: self::HybridAlignedBufferVec::item_align
526
    pub fn push<T: Pod + ShaderSize>(&mut self, value: &T) -> Range<u32> {
15✔
527
        let src: &[u8] = cast_slice(std::slice::from_ref(value));
60✔
528
        assert_eq!(value.size().get() as usize, src.len());
75✔
529
        self.push_raw(src)
45✔
530
    }
531

532
    /// Append a slice of values to the buffer.
533
    ///
534
    /// The values are assumed to be tightly packed, and will be copied
535
    /// back-to-back into the buffer, without any padding between them. This
536
    /// means that the individual slice items must be properly aligned relative
537
    /// to the beginning of the slice.
538
    ///
539
    /// As with [`set_content()`], the content is stored on the CPU and uploaded
540
    /// on the GPU once [`write_buffers()`] is called.
541
    ///
542
    /// # Returns
543
    ///
544
    /// Returns a range starting at the byte offset at which the new element
545
    /// (the slice) was inserted, which is guaranteed to be a multiple of
546
    /// [`item_align()`]. The range span is the item byte size.
547
    ///
548
    /// # Panics
549
    ///
550
    /// Panics if the byte size of the element `T` is not a multiple of
551
    /// the minimum GPU alignment, which is 4 bytes. Note that this doesn't
552
    /// guarantee that the written data is well-formed for use on GPU, as array
553
    /// elements on GPU have other alignment requirements according to WGSL, but
554
    /// at least this catches obvious errors.
555
    ///
556
    /// [`item_align()`]: self::HybridAlignedBufferVec::item_align
557
    #[allow(dead_code)]
558
    pub fn push_many<T: Pod + ShaderSize>(&mut self, value: &[T]) -> Range<u32> {
×
559
        assert_eq!(size_of::<T>() % 4, 0);
×
560
        let src: &[u8] = cast_slice(value);
×
561
        self.push_raw(src)
×
562
    }
563

564
    pub fn push_raw(&mut self, src: &[u8]) -> Range<u32> {
16✔
565
        self.is_stale = true;
16✔
566

567
        // Calculate the number of (aligned) rows to allocate
568
        let num_rows = src.len().div_ceil(self.item_align) as u32;
64✔
569

570
        // Try to find a block of free rows which can accommodate it, and pick the
571
        // smallest one in order to limit wasted space.
572
        let mut best_slot: Option<(u32, usize)> = None;
48✔
573
        for (index, range) in self.free_rows.iter().enumerate() {
32✔
574
            let free_rows = range.0.end - range.0.start;
×
575
            if free_rows >= num_rows {
×
576
                let wasted_rows = free_rows - num_rows;
×
577
                // If we found a slot with the exact size, just use it already
578
                if wasted_rows == 0 {
×
579
                    best_slot = Some((0, index));
×
580
                    break;
581
                }
582
                // Otherwise try to find the smallest oversized slot to reduce wasted space
583
                if let Some(best_slot) = best_slot.as_mut() {
×
584
                    if wasted_rows < best_slot.0 {
×
585
                        *best_slot = (wasted_rows, index);
×
586
                    }
587
                } else {
588
                    best_slot = Some((wasted_rows, index));
×
589
                }
590
            }
591
        }
592

593
        // Insert into existing space
594
        if let Some((_, index)) = best_slot {
16✔
595
            let row_range = self.free_rows.remove(index);
596
            let offset = row_range.0.start as usize * self.item_align;
597
            let free_size = (row_range.0.end - row_range.0.start) as usize * self.item_align;
598
            let size = src.len();
599
            assert!(size <= free_size);
600

601
            let dst = self.values.as_mut_ptr();
×
602
            // SAFETY: dst is guaranteed to point to allocated bytes, which are already
603
            // initialized from a previous call, and are initialized by overwriting the
604
            // bytes with those of a POD type.
605
            #[allow(unsafe_code)]
606
            unsafe {
607
                let dst = dst.add(offset);
×
608
                dst.copy_from_nonoverlapping(src.as_ptr(), size);
×
609
            }
610

611
            let start = offset as u32;
×
612
            let end = start + size as u32;
×
613
            start..end
×
614
        }
615
        // Insert at end of vector, after resizing it
616
        else {
617
            // Calculate new aligned insertion offset and new capacity
618
            let offset = self.values.len().next_multiple_of(self.item_align);
80✔
619
            let size = src.len();
48✔
620
            let new_capacity = offset + size;
32✔
621
            if new_capacity > self.values.capacity() {
32✔
622
                let additional = new_capacity - self.values.len();
15✔
623
                self.values.reserve(additional)
15✔
624
            }
625

626
            // Insert padding if needed
627
            if offset > self.values.len() {
42✔
628
                self.values.resize(offset, 0);
20✔
629
            }
630

631
            // Insert serialized value
632
            // Dealing with safe code via Vec::spare_capacity_mut() is quite difficult
633
            // without the upcoming (unstable) additions to MaybeUninit to deal with arrays.
634
            // To prevent having to loop over individual u8, we use direct pointers instead.
635
            assert!(self.values.capacity() >= offset + size);
64✔
636
            assert_eq!(self.values.len(), offset);
48✔
637
            let dst = self.values.as_mut_ptr();
48✔
638
            // SAFETY: dst is guaranteed to point to allocated (offset+size) bytes, which
639
            // are written by copying a Pod type, so ensures those values are initialized,
640
            // and the final size is set to exactly (offset+size).
641
            #[allow(unsafe_code)]
642
            unsafe {
643
                let dst = dst.add(offset);
80✔
644
                dst.copy_from_nonoverlapping(src.as_ptr(), size);
96✔
645
                self.values.set_len(offset + size);
48✔
646
            }
647

648
            debug_assert_eq!(offset % self.item_align, 0);
32✔
649
            let start = offset as u32;
32✔
650
            let end = start + size as u32;
32✔
651
            start..end
16✔
652
        }
653
    }
654

655
    /// Remove a range of bytes previously added.
656
    ///
657
    /// Remove a range of bytes previously returned by adding one or more
658
    /// elements with [`push()`] or [`push_many()`].
659
    ///
660
    /// # Returns
661
    ///
662
    /// Returns `true` if the range was valid and the corresponding data was
663
    /// removed, or `false` otherwise. In that case, the buffer is not modified.
664
    ///
665
    /// [`push()`]: Self::push
666
    /// [`push_many()`]: Self::push_many
667
    pub fn remove(&mut self, range: Range<u32>) -> bool {
17✔
668
        // Can only remove entire blocks starting at an aligned size
669
        let align = self.item_align as u32;
34✔
670
        if range.start % align != 0 {
17✔
671
            return false;
×
672
        }
673

674
        // Check for out of bounds argument
675
        let end = self.values.len() as u32;
676
        if range.start >= end || range.end > end {
17✔
677
            return false;
×
678
        }
679

680
        // Note: See below; sometimes self.values has some padding left that we couldn't
681
        // recover earlier because we didn't know the size of this allocation, but we
682
        // still need to deallocate the row here.
683
        if range.end == end || range.end.next_multiple_of(align) == end {
11✔
684
            // If the allocation is at the end of the buffer, shorten the CPU values. This
685
            // ensures is_empty() eventually returns true.
686
            let mut new_row_end = range.start.div_ceil(align);
8✔
687

688
            // Walk the (sorted) free list to also dequeue any range which is now at the end
689
            // of the buffer
690
            while let Some(free_row) = self.free_rows.pop() {
16✔
691
                if free_row.0.end == new_row_end {
4✔
692
                    new_row_end = free_row.0.start;
4✔
693
                } else {
694
                    self.free_rows.push(free_row);
×
695
                    break;
696
                }
697
            }
698

699
            // Note: we can't really recover any padding here because we don't know the
700
            // exact size of that allocation, only its row-aligned size.
701
            self.values.truncate((new_row_end * align) as usize);
702
        } else {
703
            // Otherwise, save the row into the free list.
704
            let start = range.start / align;
9✔
705
            let end = range.end.div_ceil(align);
706
            let free_row = FreeRow(start..end);
707

708
            // Insert as sorted
709
            if self.free_rows.is_empty() {
4✔
710
                // Special case to simplify below, and to avoid binary_search()
711
                self.free_rows.push(free_row);
8✔
712
            } else if let Err(index) = self.free_rows.binary_search(&free_row) {
13✔
713
                if index >= self.free_rows.len() {
714
                    // insert at end
715
                    let prev = self.free_rows.last_mut().unwrap(); // known
3✔
716
                    if prev.0.end == free_row.0.start {
2✔
717
                        // merge with last value
718
                        prev.0.end = free_row.0.end;
1✔
719
                    } else {
720
                        // insert last, with gap
721
                        self.free_rows.push(free_row);
×
722
                    }
723
                } else if index == 0 {
3✔
724
                    // insert at start
725
                    let next = &mut self.free_rows[0];
4✔
726
                    if free_row.0.end == next.0.start {
3✔
727
                        // merge with next
728
                        next.0.start = free_row.0.start;
1✔
729
                    } else {
730
                        // insert first, with gap
731
                        self.free_rows.insert(0, free_row);
1✔
732
                    }
733
                } else {
734
                    // insert between 2 existing elements
735
                    let prev = &mut self.free_rows[index - 1];
1✔
736
                    if prev.0.end == free_row.0.start {
737
                        // merge with previous value
738
                        prev.0.end = free_row.0.end;
1✔
739

740
                        let prev = self.free_rows[index - 1].clone();
3✔
741
                        let next = &mut self.free_rows[index];
2✔
742
                        if prev.0.end == next.0.start {
2✔
743
                            // also merge prev with next, and remove prev
744
                            next.0.start = prev.0.start;
2✔
745
                            self.free_rows.remove(index - 1);
2✔
746
                        }
747
                    } else {
748
                        let next = &mut self.free_rows[index];
×
749
                        if free_row.0.end == next.0.start {
×
750
                            // merge with next value
751
                            next.0.start = free_row.0.start;
×
752
                        } else {
753
                            // insert between 2 values, with gaps on both sides
754
                            self.free_rows.insert(index, free_row);
×
755
                        }
756
                    }
757
                }
758
            } else {
759
                // The range exists in the free list, which means it was already removed. This is a
760
                // duplicate; ignore it.
761
                return false;
1✔
762
            }
763
        }
764
        self.is_stale = true;
16✔
765
        true
766
    }
767

768
    /// Update an allocated entry with a new value.
769
    #[inline]
NEW
770
    pub fn update<T: Pod + ShaderSize>(&mut self, offset: u32, value: &T) {
×
NEW
771
        let data: &[u8] = cast_slice(std::slice::from_ref(value));
×
NEW
772
        assert_eq!(value.size().get() as usize, data.len());
×
NEW
773
        self.update_raw(offset, data);
×
774
    }
775

776
    /// Update an allocated entry with new data.
NEW
777
    pub fn update_raw(&mut self, offset: u32, data: &[u8]) {
×
778
        // Can only update entire blocks starting at an aligned size
779
        let align = self.item_align as u32;
×
780
        if offset % align != 0 {
×
781
            return;
×
782
        }
783

784
        // Check for out of bounds argument
785
        let end = self.values.len() as u32;
786
        let data_end = offset + data.len() as u32;
787
        if offset >= end || data_end > end {
×
788
            return;
×
789
        }
790

791
        let dst: &mut [u8] = &mut self.values[offset as usize..data_end as usize];
792
        dst.copy_from_slice(data);
793

794
        self.is_stale = true;
795
    }
796

797
    /// Reserve some capacity into the buffer.
798
    ///
799
    /// If the buffer is reallocated, the old content (on the GPU) is lost, and
800
    /// needs to be re-uploaded to the newly-created buffer. This is done with
801
    /// [`write_buffer()`].
802
    ///
803
    /// # Returns
804
    ///
805
    /// `true` if the buffer was (re)allocated, or `false` if an existing buffer
806
    /// was reused which already had enough capacity.
807
    ///
808
    /// [`write_buffer()`]: crate::HybridAlignedBufferVec::write_buffer
809
    pub fn reserve(&mut self, capacity: usize, device: &RenderDevice) -> bool {
1✔
810
        if capacity > self.capacity {
1✔
811
            trace!(
1✔
812
                "reserve: increase capacity from {} to {} bytes",
1✔
813
                self.capacity,
814
                capacity,
815
            );
816
            self.capacity = capacity;
1✔
817
            if let Some(buffer) = self.buffer.take() {
1✔
818
                buffer.destroy();
819
            }
820
            self.buffer = Some(device.create_buffer(&BufferDescriptor {
3✔
821
                label: self.label.as_ref().map(|s| &s[..]),
4✔
822
                size: capacity as BufferAddress,
1✔
823
                usage: BufferUsages::COPY_DST | self.buffer_usage,
1✔
824
                mapped_at_creation: false,
825
            }));
826
            self.is_stale = !self.values.is_empty();
1✔
827
            // FIXME - this discards the old content if any!!!
828
            true
1✔
829
        } else {
830
            false
×
831
        }
832
    }
833

834
    /// Schedule the buffer write to GPU.
835
    ///
836
    /// # Returns
837
    ///
838
    /// `true` if the buffer was (re)allocated, `false` otherwise. If the buffer
839
    /// was reallocated, all bind groups referencing the old buffer should be
840
    /// destroyed.
841
    pub fn write_buffer(&mut self, device: &RenderDevice, queue: &RenderQueue) -> bool {
1,050✔
842
        if self.values.is_empty() || !self.is_stale {
2,109✔
843
            return false;
1,049✔
844
        }
845
        let size = self.values.len();
3✔
846
        trace!(
1✔
847
            "hybrid abv: write_buffer: size={}B item_align={}B",
1✔
848
            size,
849
            self.item_align,
850
        );
851
        let buffer_changed = self.reserve(size, device);
5✔
852
        if let Some(buffer) = &self.buffer {
2✔
853
            queue.write_buffer(buffer, 0, self.values.as_slice());
854
            self.is_stale = false;
855
        }
856
        buffer_changed
1✔
857
    }
858

859
    #[allow(dead_code)]
860
    pub fn clear(&mut self) {
×
861
        if !self.values.is_empty() {
×
862
            self.is_stale = true;
×
863
        }
864
        self.values.clear();
×
865
    }
866
}
867

868
#[cfg(test)]
869
mod tests {
870
    use std::num::NonZeroU64;
871

872
    use bevy::math::Vec3;
873
    use bytemuck::{Pod, Zeroable};
874

875
    use super::*;
876

877
    #[repr(C)]
878
    #[derive(Debug, Default, Clone, Copy, Pod, Zeroable, ShaderType)]
879
    pub(crate) struct GpuDummy {
880
        pub v: Vec3,
881
    }
882

883
    #[repr(C)]
884
    #[derive(Debug, Default, Clone, Copy, Pod, Zeroable, ShaderType)]
885
    pub(crate) struct GpuDummyComposed {
886
        pub simple: GpuDummy,
887
        pub tag: u32,
888
        // GPU padding to 16 bytes due to GpuDummy forcing align to 16 bytes
889
    }
890

891
    #[repr(C)]
892
    #[derive(Debug, Clone, Copy, Pod, Zeroable, ShaderType)]
893
    pub(crate) struct GpuDummyLarge {
894
        pub simple: GpuDummy,
895
        pub tag: u32,
896
        pub large: [f32; 128],
897
    }
898

899
    #[test]
900
    fn abv_sizes() {
901
        // Rust
902
        assert_eq!(std::mem::size_of::<GpuDummy>(), 12);
903
        assert_eq!(std::mem::align_of::<GpuDummy>(), 4);
904
        assert_eq!(std::mem::size_of::<GpuDummyComposed>(), 16); // tight packing
905
        assert_eq!(std::mem::align_of::<GpuDummyComposed>(), 4);
906
        assert_eq!(std::mem::size_of::<GpuDummyLarge>(), 132 * 4); // tight packing
907
        assert_eq!(std::mem::align_of::<GpuDummyLarge>(), 4);
908

909
        // GPU
910
        assert_eq!(<GpuDummy as ShaderType>::min_size().get(), 16); // Vec3 gets padded to 16 bytes
911
        assert_eq!(<GpuDummy as ShaderSize>::SHADER_SIZE.get(), 16);
912
        assert_eq!(<GpuDummyComposed as ShaderType>::min_size().get(), 32); // align is 16 bytes, forces padding
913
        assert_eq!(<GpuDummyComposed as ShaderSize>::SHADER_SIZE.get(), 32);
914
        assert_eq!(<GpuDummyLarge as ShaderType>::min_size().get(), 544); // align is 16 bytes, forces padding
915
        assert_eq!(<GpuDummyLarge as ShaderSize>::SHADER_SIZE.get(), 544);
916

917
        for (item_align, expected_aligned_size) in [
918
            (0, 16),
919
            (4, 16),
920
            (8, 16),
921
            (16, 16),
922
            (32, 32),
923
            (256, 256),
924
            (512, 512),
925
        ] {
926
            let mut abv = AlignedBufferVec::<GpuDummy>::new(
927
                BufferUsages::STORAGE,
928
                NonZeroU64::new(item_align),
929
                None,
930
            );
931
            assert_eq!(abv.aligned_size(), expected_aligned_size);
932
            assert!(abv.is_empty());
933
            abv.push(GpuDummy::default());
934
            assert!(!abv.is_empty());
935
            assert_eq!(abv.len(), 1);
936
        }
937

938
        for (item_align, expected_aligned_size) in [
939
            (0, 32),
940
            (4, 32),
941
            (8, 32),
942
            (16, 32),
943
            (32, 32),
944
            (256, 256),
945
            (512, 512),
946
        ] {
947
            let mut abv = AlignedBufferVec::<GpuDummyComposed>::new(
948
                BufferUsages::STORAGE,
949
                NonZeroU64::new(item_align),
950
                None,
951
            );
952
            assert_eq!(abv.aligned_size(), expected_aligned_size);
953
            assert!(abv.is_empty());
954
            abv.push(GpuDummyComposed::default());
955
            assert!(!abv.is_empty());
956
            assert_eq!(abv.len(), 1);
957
        }
958

959
        for (item_align, expected_aligned_size) in [
960
            (0, 544),
961
            (4, 544),
962
            (8, 544),
963
            (16, 544),
964
            (32, 544),
965
            (256, 768),
966
            (512, 1024),
967
        ] {
968
            let mut abv = AlignedBufferVec::<GpuDummyLarge>::new(
969
                BufferUsages::STORAGE,
970
                NonZeroU64::new(item_align),
971
                None,
972
            );
973
            assert_eq!(abv.aligned_size(), expected_aligned_size);
974
            assert!(abv.is_empty());
975
            abv.push(GpuDummyLarge {
976
                simple: Default::default(),
977
                tag: 0,
978
                large: [0.; 128],
979
            });
980
            assert!(!abv.is_empty());
981
            assert_eq!(abv.len(), 1);
982
        }
983
    }
984

985
    #[test]
986
    fn habv_remove() {
987
        let mut habv =
988
            HybridAlignedBufferVec::new(BufferUsages::STORAGE, NonZeroU64::new(32).unwrap(), None);
989
        assert!(habv.is_empty());
990
        assert_eq!(habv.item_align, 32);
991

992
        // +r -r
993
        {
994
            let r = habv.push(&42u32);
995
            assert_eq!(r, 0..4);
996
            assert!(!habv.is_empty());
997
            assert_eq!(habv.values.len(), 4);
998
            assert!(habv.free_rows.is_empty());
999

1000
            assert!(habv.remove(r));
1001
            assert!(habv.is_empty());
1002
            assert!(habv.values.is_empty());
1003
            assert!(habv.free_rows.is_empty());
1004
        }
1005

1006
        // +r0 +r1 +r2 -r0 -r0 -r1 -r2
1007
        {
1008
            let r0 = habv.push(&42u32);
1009
            let r1 = habv.push(&84u32);
1010
            let r2 = habv.push(&84u32);
1011
            assert_eq!(r0, 0..4);
1012
            assert_eq!(r1, 32..36);
1013
            assert_eq!(r2, 64..68);
1014
            assert!(!habv.is_empty());
1015
            assert_eq!(habv.values.len(), 68);
1016
            assert!(habv.free_rows.is_empty());
1017

1018
            assert!(habv.remove(r0.clone()));
1019
            assert!(!habv.is_empty());
1020
            assert_eq!(habv.values.len(), 68);
1021
            assert_eq!(habv.free_rows.len(), 1);
1022
            assert_eq!(habv.free_rows[0], FreeRow(0..1));
1023

1024
            // dupe; no-op
1025
            assert!(!habv.remove(r0));
1026

1027
            assert!(habv.remove(r1.clone()));
1028
            assert!(!habv.is_empty());
1029
            assert_eq!(habv.values.len(), 68);
1030
            assert_eq!(habv.free_rows.len(), 1); // merged!
1031
            assert_eq!(habv.free_rows[0], FreeRow(0..2));
1032

1033
            assert!(habv.remove(r2));
1034
            assert!(habv.is_empty());
1035
            assert_eq!(habv.values.len(), 0);
1036
            assert!(habv.free_rows.is_empty());
1037
        }
1038

1039
        // +r0 +r1 +r2 -r1 -r0 -r2
1040
        {
1041
            let r0 = habv.push(&42u32);
1042
            let r1 = habv.push(&84u32);
1043
            let r2 = habv.push(&84u32);
1044
            assert_eq!(r0, 0..4);
1045
            assert_eq!(r1, 32..36);
1046
            assert_eq!(r2, 64..68);
1047
            assert!(!habv.is_empty());
1048
            assert_eq!(habv.values.len(), 68);
1049
            assert!(habv.free_rows.is_empty());
1050

1051
            assert!(habv.remove(r1.clone()));
1052
            assert!(!habv.is_empty());
1053
            assert_eq!(habv.values.len(), 68);
1054
            assert_eq!(habv.free_rows.len(), 1);
1055
            assert_eq!(habv.free_rows[0], FreeRow(1..2));
1056

1057
            assert!(habv.remove(r0.clone()));
1058
            assert!(!habv.is_empty());
1059
            assert_eq!(habv.values.len(), 68);
1060
            assert_eq!(habv.free_rows.len(), 1); // merged!
1061
            assert_eq!(habv.free_rows[0], FreeRow(0..2));
1062

1063
            assert!(habv.remove(r2));
1064
            assert!(habv.is_empty());
1065
            assert_eq!(habv.values.len(), 0);
1066
            assert!(habv.free_rows.is_empty());
1067
        }
1068

1069
        // +r0 +r1 +r2 -r1 -r2 -r0
1070
        {
1071
            let r0 = habv.push(&42u32);
1072
            let r1 = habv.push(&84u32);
1073
            let r2 = habv.push(&84u32);
1074
            assert_eq!(r0, 0..4);
1075
            assert_eq!(r1, 32..36);
1076
            assert_eq!(r2, 64..68);
1077
            assert!(!habv.is_empty());
1078
            assert_eq!(habv.values.len(), 68);
1079
            assert!(habv.free_rows.is_empty());
1080

1081
            assert!(habv.remove(r1.clone()));
1082
            assert!(!habv.is_empty());
1083
            assert_eq!(habv.values.len(), 68);
1084
            assert_eq!(habv.free_rows.len(), 1);
1085
            assert_eq!(habv.free_rows[0], FreeRow(1..2));
1086

1087
            assert!(habv.remove(r2.clone()));
1088
            assert!(!habv.is_empty());
1089
            assert_eq!(habv.values.len(), 32); // can't recover exact alloc (4), only row-aligned size (32)
1090
            assert!(habv.free_rows.is_empty()); // merged!
1091

1092
            assert!(habv.remove(r0));
1093
            assert!(habv.is_empty());
1094
            assert_eq!(habv.values.len(), 0);
1095
            assert!(habv.free_rows.is_empty());
1096
        }
1097

1098
        // +r0 +r1 +r2 +r3 +r4 -r3 -r1 -r2 -r4 -r0
1099
        {
1100
            let r0 = habv.push(&42u32);
1101
            let r1 = habv.push(&84u32);
1102
            let r2 = habv.push(&84u32);
1103
            let r3 = habv.push(&84u32);
1104
            let r4 = habv.push(&84u32);
1105
            assert_eq!(r0, 0..4);
1106
            assert_eq!(r1, 32..36);
1107
            assert_eq!(r2, 64..68);
1108
            assert_eq!(r3, 96..100);
1109
            assert_eq!(r4, 128..132);
1110
            assert!(!habv.is_empty());
1111
            assert_eq!(habv.values.len(), 132);
1112
            assert!(habv.free_rows.is_empty());
1113

1114
            assert!(habv.remove(r3.clone()));
1115
            assert!(!habv.is_empty());
1116
            assert_eq!(habv.values.len(), 132);
1117
            assert_eq!(habv.free_rows.len(), 1);
1118
            assert_eq!(habv.free_rows[0], FreeRow(3..4));
1119

1120
            assert!(habv.remove(r1.clone()));
1121
            assert!(!habv.is_empty());
1122
            assert_eq!(habv.values.len(), 132);
1123
            assert_eq!(habv.free_rows.len(), 2);
1124
            assert_eq!(habv.free_rows[0], FreeRow(1..2)); // sorted!
1125
            assert_eq!(habv.free_rows[1], FreeRow(3..4));
1126

1127
            assert!(habv.remove(r2.clone()));
1128
            assert!(!habv.is_empty());
1129
            assert_eq!(habv.values.len(), 132);
1130
            assert_eq!(habv.free_rows.len(), 1); // merged!
1131
            assert_eq!(habv.free_rows[0], FreeRow(1..4)); // merged!
1132

1133
            assert!(habv.remove(r4.clone()));
1134
            assert!(!habv.is_empty());
1135
            assert_eq!(habv.values.len(), 32); // can't recover exact alloc (4), only row-aligned size (32)
1136
            assert!(habv.free_rows.is_empty());
1137

1138
            assert!(habv.remove(r0));
1139
            assert!(habv.is_empty());
1140
            assert_eq!(habv.values.len(), 0);
1141
            assert!(habv.free_rows.is_empty());
1142
        }
1143
    }
1144
}
1145

1146
#[cfg(all(test, feature = "gpu_tests"))]
1147
mod gpu_tests {
1148
    use tests::*;
1149

1150
    use super::*;
1151
    use crate::test_utils::MockRenderer;
1152

1153
    #[test]
1154
    fn abv_write() {
1155
        let renderer = MockRenderer::new();
1156
        let device = renderer.device();
1157
        let queue = renderer.queue();
1158

1159
        // Create a dummy CommandBuffer to force the write_buffer() call to have any
1160
        // effect.
1161
        let encoder = device.create_command_encoder(&wgpu::CommandEncoderDescriptor {
1162
            label: Some("test"),
1163
        });
1164
        let command_buffer = encoder.finish();
1165

1166
        let item_align = device.limits().min_storage_buffer_offset_alignment as u64;
1167
        let mut abv = AlignedBufferVec::<GpuDummyComposed>::new(
1168
            BufferUsages::STORAGE | BufferUsages::MAP_READ,
1169
            NonZeroU64::new(item_align),
1170
            None,
1171
        );
1172
        let final_align = item_align.max(<GpuDummyComposed as ShaderSize>::SHADER_SIZE.get());
1173
        assert_eq!(abv.aligned_size(), final_align as usize);
1174

1175
        const CAPACITY: usize = 42;
1176

1177
        // Write buffer (CPU -> GPU)
1178
        abv.push(GpuDummyComposed {
1179
            tag: 1,
1180
            ..Default::default()
1181
        });
1182
        abv.push(GpuDummyComposed {
1183
            tag: 2,
1184
            ..Default::default()
1185
        });
1186
        abv.push(GpuDummyComposed {
1187
            tag: 3,
1188
            ..Default::default()
1189
        });
1190
        abv.reserve(CAPACITY, &device);
1191
        abv.write_buffer(&device, &queue);
1192
        // need a submit() for write_buffer() to be processed
1193
        queue.submit([command_buffer]);
1194
        let (tx, rx) = futures::channel::oneshot::channel();
1195
        queue.on_submitted_work_done(move || {
1196
            tx.send(()).unwrap();
1197
        });
1198
        device.poll(wgpu::Maintain::Wait);
1199
        let _ = futures::executor::block_on(rx);
1200
        println!("Buffer written");
1201

1202
        // Read back (GPU -> CPU)
1203
        let buffer = abv.buffer();
1204
        let buffer = buffer.as_ref().expect("Buffer was not allocated");
1205
        let buffer = buffer.slice(..);
1206
        let (tx, rx) = futures::channel::oneshot::channel();
1207
        buffer.map_async(wgpu::MapMode::Read, move |result| {
1208
            tx.send(result).unwrap();
1209
        });
1210
        device.poll(wgpu::Maintain::Wait);
1211
        let _result = futures::executor::block_on(rx);
1212
        let view = buffer.get_mapped_range();
1213

1214
        // Validate content
1215
        assert_eq!(view.len(), final_align as usize * CAPACITY);
1216
        for i in 0..3 {
1217
            let offset = i * final_align as usize;
1218
            let dummy_composed: &[GpuDummyComposed] =
1219
                cast_slice(&view[offset..offset + std::mem::size_of::<GpuDummyComposed>()]);
1220
            assert_eq!(dummy_composed[0].tag, (i + 1) as u32);
1221
        }
1222
    }
1223
}