joaoh82 / rust_sqlite / 24983496245

27 Apr 2026 08:01AM UTC coverage: 67.574% (+0.7%) from 66.922%

push

github

web-flow
Phase 7a: VECTOR(N) column type (storage only) (#42)

* Phase 7a: VECTOR(N) column type — storage only

First sub-phase of the AI-era extensions. Adds a fixed-dimension
dense f32 array as a first-class SQL data type, with the cell
encoding + parser plumbing it needs. No distance functions or
similarity search yet — those land in 7b/7c/7d on top of this.

What works:

  CREATE TABLE docs (id INTEGER PRIMARY KEY, embedding VECTOR(384));
  INSERT INTO docs (embedding) VALUES ([0.1, 0.2, ..., 0.0]);
  SELECT * FROM docs;

Roundtrips through save / open. Wrong-dimension INSERTs are rejected
with a clean typed error. Empty / non-positive / non-numeric
dimensions in CREATE TABLE are rejected with descriptive messages.
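
The INSERT-side check described above can be sketched roughly as follows. This is only an illustration: `parse_vector_literal_sketch` and `check_dimension` are hypothetical names, the bracketed comma-separated literal syntax is an assumption taken from the INSERT example, and the engine's real parser may accept or reject inputs differently. The mismatch message mirrors the one used in table.rs.

```rust
/// Hypothetical stand-in for the engine's vector-literal parser.
/// ASSUMPTION: a literal is `[` + comma-separated floats + `]`.
pub fn parse_vector_literal_sketch(text: &str) -> Result<Vec<f32>, String> {
    let trimmed = text.trim();
    // Require the surrounding brackets; anything else is not a vector literal.
    let inner = trimmed
        .strip_prefix('[')
        .and_then(|rest| rest.strip_suffix(']'))
        .ok_or_else(|| format!("expected a bracketed list, got '{}'", trimmed))?;
    if inner.trim().is_empty() {
        return Ok(Vec::new());
    }
    // Parse each comma-separated piece; the first failure aborts the whole parse.
    inner
        .split(',')
        .map(|piece| {
            let piece = piece.trim();
            piece
                .parse::<f32>()
                .map_err(|_| format!("'{}' is not a valid f32", piece))
        })
        .collect()
}

/// Compares a parsed value's length against the column's declared dimension.
pub fn check_dimension(column: &str, declared_dim: usize, value: &[f32]) -> Result<(), String> {
    if value.len() == declared_dim {
        Ok(())
    } else {
        Err(format!(
            "Vector dimension mismatch for column '{}': declared {}, got {}",
            column,
            declared_dim,
            value.len()
        ))
    }
}
```

With these helpers, `[0.1, 0.2, 0.3]` parses to three elements and is then rejected for a `VECTOR(384)` column because 3 != 384.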

**File format bumped to v4** (see docs/file-format.md). Per the
Phase 7 plan's Q8 decision, all Phase 7 storage additions
(VECTOR + JSON + HNSW indexes coming next) live inside this
single v4 envelope — no v5 mid-Phase-7. Old v3 files reject on
open with the standard "unsupported format version" error.
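
As a rough sketch of the version gate, assuming the open path compares a stored version number against the single version this build supports. `SUPPORTED_FORMAT_VERSION` and `check_format_version` are hypothetical names, and the header layout that holds the version is not shown here (see docs/file-format.md for the real format).

```rust
/// ASSUMPTION: this build reads exactly one format version (v4 after Phase 7a).
pub const SUPPORTED_FORMAT_VERSION: u32 = 4;

/// Rejects any file whose stored version differs from the supported one.
pub fn check_format_version(found: u32) -> Result<(), String> {
    if found == SUPPORTED_FORMAT_VERSION {
        Ok(())
    } else {
        Err(format!(
            "unsupported format version {} (this build reads v{})",
            found, SUPPORTED_FORMAT_VERSION
        ))
    }
}
```

Under that assumption a v3 file fails the check with an "unsupported format version" error, while a v4 file passes.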

**Engine plumbing** (~300 LOC + 22 new tests, 184 total now passing):

  - DataType::Vector(usize) + parser handlers in DataType::new
    that round-trip through the existing string-based ParsedColumn
    pipeline (encoded as `vector(N)` in the wire string).
  - Value::Vector(Vec<f32>) at the runtime layer.
  - Row::Vector(BTreeMap<i64, Vec<f32>>) at the storage layer.
  - All 6 existing match arms in table.rs / cell.rs / executor.rs
    / pager/mod.rs extended with Vector cases.
  - Cell value tag 0x04 = VECTOR. Wire layout: tag (1 byte) +
    dim (varint) + dim×4 bytes f32 little-endian. Self-describing
    so decode_value works without schema context.
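
The wire layout in the last bullet (tag, varint dim, dim×4 bytes of little-endian f32) can be sketched like this. The commit message does not say which varint flavor the pager uses, so unsigned LEB128 is an assumption here, and `encode_vector` / `decode_vector` are hypothetical names rather than the engine's functions.

```rust
use std::convert::TryFrom;

/// Cell value tag for VECTOR, as stated in the commit message.
pub const TAG_VECTOR: u8 = 0x04;

/// ASSUMPTION: unsigned LEB128 (7 payload bits per byte, high bit = "more").
fn write_varint(mut n: u64, out: &mut Vec<u8>) {
    loop {
        let low = (n & 0x7f) as u8;
        n >>= 7;
        if n == 0 {
            out.push(low);
            return;
        }
        out.push(low | 0x80);
    }
}

/// Returns the decoded value and the number of bytes consumed.
fn read_varint(bytes: &[u8]) -> Option<(u64, usize)> {
    let mut value = 0u64;
    for (i, &b) in bytes.iter().enumerate().take(10) {
        value |= u64::from(b & 0x7f) << (7 * i);
        if b & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None
}

/// tag (1 byte) + dim (varint) + dim * 4 bytes of little-endian f32.
pub fn encode_vector(values: &[f32]) -> Vec<u8> {
    let mut out = Vec::with_capacity(1 + 10 + values.len() * 4);
    out.push(TAG_VECTOR);
    write_varint(values.len() as u64, &mut out);
    for v in values {
        out.extend_from_slice(&v.to_le_bytes());
    }
    out
}

/// Decodes without schema context: the dimension is read from the cell itself.
pub fn decode_vector(bytes: &[u8]) -> Option<Vec<f32>> {
    let (&tag, rest) = bytes.split_first()?;
    if tag != TAG_VECTOR {
        return None;
    }
    let (dim, used) = read_varint(rest)?;
    let dim = usize::try_from(dim).ok()?;
    let byte_len = dim.checked_mul(4)?;
    let payload = rest.get(used..)?;
    if payload.len() < byte_len {
        return None;
    }
    Some(
        payload[..byte_len]
            .chunks_exact(4)
            .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}
```

Because the dimension travels with the value, a decoder only needs the bytes of the cell. Under the LEB128 assumption a 384-dim vector spends two bytes on the dimension and 1536 bytes on the payload.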

**Parser** (Q6 + Q7 decisions baked in):

  - CREATE TABLE: sqlparser parses `VECTOR(N)` as DataType::Custom
    with a Vec<String> of args. New is_vector_type() + parse_vector_dim()
    helpers in parser/create.rs translate to internal `vector(N)`
    string.
  - INSERT VALUES: sqlparser sees... (continued)
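
A minimal sketch of the CREATE TABLE side, assuming the custom type arrives as a name plus a list of string arguments as described in the first Parser bullet. The `_sketch` functions below are hypothetical stand-ins for `is_vector_type()` and `parse_vector_dim()`; their real signatures and error messages live in parser/create.rs and may differ.

```rust
/// True when a custom type name is VECTOR (case-insensitive).
pub fn is_vector_type_sketch(type_name: &str) -> bool {
    type_name.eq_ignore_ascii_case("vector")
}

/// Validates the argument list of `VECTOR(...)` and returns the dimension.
/// Empty, non-positive, and non-numeric dimensions are all rejected.
pub fn parse_vector_dim_sketch(args: &[String]) -> Result<usize, String> {
    match args {
        [] => Err("VECTOR requires a dimension, e.g. VECTOR(384)".to_string()),
        [single] => match single.trim().parse::<usize>() {
            Ok(0) => Err("VECTOR dimension must be greater than zero".to_string()),
            Ok(dim) => Ok(dim),
            // Negative numbers and non-numeric text both fail the usize parse.
            Err(_) => Err(format!(
                "VECTOR dimension must be a positive integer, got '{}'",
                single
            )),
        },
        _ => Err("VECTOR takes exactly one dimension argument".to_string()),
    }
}

/// Builds the internal wire string that `DataType::new` later parses.
pub fn to_wire_string_sketch(dim: usize) -> String {
    format!("vector({})", dim)
}
```

For `VECTOR(384)` the argument list is `["384"]`, which validates to 384 and becomes the wire string `vector(384)`.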

246 of 313 new or added lines in 7 files covered. (78.59%)

7 existing lines in 1 file now uncovered.

4247 of 6285 relevant lines covered (67.57%)

1.24 hits per line

Source File
/src/sql/db/table.rs (68.78% covered)
use crate::error::{Result, SQLRiteError};
use crate::sql::db::secondary_index::{IndexOrigin, SecondaryIndex};
use crate::sql::parser::create::CreateQuery;
use std::collections::{BTreeMap, HashMap};
use std::fmt;
use std::sync::{Arc, Mutex};

use prettytable::{Cell as PrintCell, Row as PrintRow, Table as PrintTable};

/// SQLRite data types
/// Modeled after SQLite's storage classes and type affinities
/// ([Datatypes In SQLite Version 3](https://www.sqlite.org/datatype3.html)).
///
/// `Vector(dim)` is the Phase 7a addition — a fixed-dimension dense f32
/// array. The dimension is part of the type so a `VECTOR(384)` column
/// rejects `[0.1, 0.2, 0.3]` at INSERT time as a clean type error
/// rather than silently storing the wrong shape.
#[derive(PartialEq, Debug, Clone)]
pub enum DataType {
    Integer,
    Text,
    Real,
    Bool,
    /// Dense f32 vector of fixed dimension. The `usize` is the column's
    /// declared dimension; every value stored in the column must have
    /// exactly that many elements.
    Vector(usize),
    None,
    Invalid,
}

impl DataType {
    /// Constructs a `DataType` from the wire string the parser produces.
    /// Pre-Phase-7 the strings were one of `"integer" | "text" | "real" |
    /// "bool" | "none"`. Phase 7a adds `"vector(N)"` (case-insensitive,
    /// N a positive integer) for the new vector column type — encoded
    /// in-band so we don't have to plumb a richer type through the
    /// existing string-based ParsedColumn pipeline.
    pub fn new(cmd: String) -> DataType {
        let lower = cmd.to_lowercase();
        match lower.as_str() {
            "integer" => DataType::Integer,
            "text" => DataType::Text,
            "real" => DataType::Real,
            "bool" => DataType::Bool,
            "none" => DataType::None,
            other if other.starts_with("vector(") && other.ends_with(')') => {
                // Strip the `vector(` prefix and trailing `)`, parse what's
                // left as a positive integer dimension. Anything else is
                // Invalid — surfaces a clean error at CREATE TABLE time.
                let inside = &other["vector(".len()..other.len() - 1];
                match inside.trim().parse::<usize>() {
                    Ok(dim) if dim > 0 => DataType::Vector(dim),
                    _ => {
                        eprintln!("Invalid VECTOR dimension in {cmd}");
                        DataType::Invalid
                    }
                }
            }
            _ => {
                eprintln!("Invalid data type given {}", cmd);
                DataType::Invalid
            }
        }
    }

    /// Inverse of `new`: returns the canonical wire string for this
    /// DataType. The scalar names are capitalized and the vector form is
    /// the lowercase `vector(N)`; `new` lowercases its input, so both
    /// spellings parse back. Used by the parser to round-trip
    /// `VECTOR(N)` → `DataType::Vector(N)` → `"vector(N)"` into
    /// `ParsedColumn::datatype` so the rest of the pipeline keeps
    /// working with strings.
    pub fn to_wire_string(&self) -> String {
        match self {
            DataType::Integer => "Integer".to_string(),
            DataType::Text => "Text".to_string(),
            DataType::Real => "Real".to_string(),
            DataType::Bool => "Bool".to_string(),
            DataType::Vector(dim) => format!("vector({dim})"),
            DataType::None => "None".to_string(),
            DataType::Invalid => "Invalid".to_string(),
        }
    }
}

impl fmt::Display for DataType {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            DataType::Integer => f.write_str("Integer"),
            DataType::Text => f.write_str("Text"),
            DataType::Real => f.write_str("Real"),
            DataType::Bool => f.write_str("Boolean"),
            DataType::Vector(dim) => write!(f, "Vector({dim})"),
            DataType::None => f.write_str("None"),
            DataType::Invalid => f.write_str("Invalid"),
        }
    }
}

/// The schema for each SQL table is represented in memory by the
/// following structure.
///
/// `rows` is `Arc<Mutex<...>>` rather than `Rc<RefCell<...>>` so `Table`
/// (and by extension `Database`) is `Send + Sync` — the Tauri desktop
/// app holds the engine in shared state behind a `Mutex<Database>`, and
/// Tauri's state container requires its contents to be thread-safe.
#[derive(Debug)]
pub struct Table {
    /// Name of the table
    pub tb_name: String,
    /// Schema for each column, in declaration order.
    pub columns: Vec<Column>,
    /// Per-column row storage, keyed by column name. Every column's
    /// `Row::T(BTreeMap)` is keyed by rowid, so all columns share the same
    /// keyset after each write.
    pub rows: Arc<Mutex<HashMap<String, Row>>>,
    /// Secondary indexes on this table (Phase 3e). One auto-created entry
    /// per UNIQUE or PRIMARY KEY column; explicit `CREATE INDEX` statements
    /// add more. Looking up an index: iterate by column name, or by index
    /// name via `Table::index_by_name`.
    pub secondary_indexes: Vec<SecondaryIndex>,
    /// ROWID of most recent insert.
    pub last_rowid: i64,
    /// PRIMARY KEY column name, or "-1" if the table has no PRIMARY KEY.
    pub primary_key: String,
}

impl Table {
    pub fn new(create_query: CreateQuery) -> Self {
        let table_name = create_query.table_name;
        let mut primary_key: String = String::from("-1");
        let columns = create_query.columns;

        let mut table_cols: Vec<Column> = vec![];
        let table_rows: Arc<Mutex<HashMap<String, Row>>> = Arc::new(Mutex::new(HashMap::new()));
        let mut secondary_indexes: Vec<SecondaryIndex> = Vec::new();
        for col in &columns {
            let col_name = &col.name;
            if col.is_pk {
                primary_key = col_name.to_string();
            }
            table_cols.push(Column::new(
                col_name.to_string(),
                col.datatype.to_string(),
                col.is_pk,
                col.not_null,
                col.is_unique,
            ));

            let dt = DataType::new(col.datatype.to_string());
            let row_storage = match &dt {
                DataType::Integer => Row::Integer(BTreeMap::new()),
                DataType::Real => Row::Real(BTreeMap::new()),
                DataType::Text => Row::Text(BTreeMap::new()),
                DataType::Bool => Row::Bool(BTreeMap::new()),
                // The dimension is enforced at INSERT time against the
                // column's declared DataType::Vector(dim). The Row variant
                // itself doesn't carry the dim — every stored Vec<f32>
                // already has it via .len().
                DataType::Vector(_dim) => Row::Vector(BTreeMap::new()),
                DataType::Invalid | DataType::None => Row::None,
            };
            table_rows
                .lock()
                .expect("Table row storage mutex poisoned")
                .insert(col.name.to_string(), row_storage);

            // Auto-create an index for every UNIQUE / PRIMARY KEY column,
            // but only for types we know how to index. Real / Bool / Vector
            // UNIQUE columns fall back to the linear scan path in
            // validate_unique_constraint — same behavior as before 3e.
            // (Vector UNIQUE is unusual; the linear-scan path will work
            // via Value::Vector PartialEq, just at O(N) cost.)
            if (col.is_pk || col.is_unique) && matches!(dt, DataType::Integer | DataType::Text) {
                let name = SecondaryIndex::auto_name(&table_name, &col.name);
                match SecondaryIndex::new(
                    name,
                    table_name.clone(),
                    col.name.clone(),
                    &dt,
                    true,
                    IndexOrigin::Auto,
                ) {
                    Ok(idx) => secondary_indexes.push(idx),
                    Err(_) => {
                        // Unreachable given the matches! guard above, but
                        // the builder returns Result so we keep the arm.
                    }
                }
            }
        }

        Table {
            tb_name: table_name,
            columns: table_cols,
            rows: table_rows,
            secondary_indexes,
            last_rowid: 0,
            primary_key,
        }
    }

    /// Deep-clones a `Table` for transaction snapshots (Phase 4f).
    ///
    /// The normal `Clone` derive would shallow-clone the `Arc<Mutex<_>>`
    /// wrapping our row storage, leaving both copies sharing the same
    /// inner map — mutating the snapshot would corrupt the live table
    /// and vice versa. Instead we lock, clone the inner `HashMap`, and
    /// wrap it in a fresh `Arc<Mutex<_>>`. Columns and indexes derive
    /// `Clone` directly (all their fields are plain data).
    pub fn deep_clone(&self) -> Self {
        let cloned_rows: HashMap<String, Row> = {
            let guard = self.rows.lock().expect("row mutex poisoned");
            guard.clone()
        };
        Table {
            tb_name: self.tb_name.clone(),
            columns: self.columns.clone(),
            rows: Arc::new(Mutex::new(cloned_rows)),
            secondary_indexes: self.secondary_indexes.clone(),
            last_rowid: self.last_rowid,
            primary_key: self.primary_key.clone(),
        }
    }

    /// Finds an auto- or explicit-index entry for a given column. Returns
    /// `None` if the column isn't indexed.
    pub fn index_for_column(&self, column: &str) -> Option<&SecondaryIndex> {
        self.secondary_indexes
            .iter()
            .find(|i| i.column_name == column)
    }

    fn index_for_column_mut(&mut self, column: &str) -> Option<&mut SecondaryIndex> {
        self.secondary_indexes
            .iter_mut()
            .find(|i| i.column_name == column)
    }

    /// Finds a secondary index by its own name (e.g., `sqlrite_autoindex_users_email`
    /// or a user-provided CREATE INDEX name). Used by Phase 3e.2 to look up
    /// explicit indexes when DROP INDEX lands.
    #[allow(dead_code)]
    pub fn index_by_name(&self, name: &str) -> Option<&SecondaryIndex> {
        self.secondary_indexes.iter().find(|i| i.name == name)
    }

    /// Returns `true` if the table has a `Column` with the given name.
    pub fn contains_column(&self, column: String) -> bool {
        self.columns.iter().any(|col| col.column_name == column)
    }

    /// Returns the list of column names in declaration order.
    pub fn column_names(&self) -> Vec<String> {
        self.columns.iter().map(|c| c.column_name.clone()).collect()
    }

    /// Returns all rowids currently stored in the table, in ascending order.
    /// Every column's BTreeMap has the same keyset, so we just read from the first column.
    pub fn rowids(&self) -> Vec<i64> {
        let Some(first) = self.columns.first() else {
            return vec![];
        };
        let rows = self.rows.lock().expect("rows mutex poisoned");
        rows.get(&first.column_name)
            .map(|r| r.rowids())
            .unwrap_or_default()
    }

    /// Reads a single cell at `(column, rowid)`.
    pub fn get_value(&self, column: &str, rowid: i64) -> Option<Value> {
        let rows = self.rows.lock().expect("rows mutex poisoned");
        rows.get(column).and_then(|r| r.get(rowid))
    }

    /// Removes the row identified by `rowid` from every column's storage and
    /// from every secondary index entry.
    pub fn delete_row(&mut self, rowid: i64) {
        // Snapshot the values we're about to delete so we can strip them
        // from secondary indexes by (value, rowid) before the row storage
        // disappears.
        let per_column_values: Vec<(String, Option<Value>)> = self
            .columns
            .iter()
            .map(|c| (c.column_name.clone(), self.get_value(&c.column_name, rowid)))
            .collect();

        // Remove from row storage.
        {
            let rows_clone = Arc::clone(&self.rows);
            let mut row_data = rows_clone.lock().expect("rows mutex poisoned");
            for col in &self.columns {
                if let Some(r) = row_data.get_mut(&col.column_name) {
                    match r {
                        Row::Integer(m) => {
                            m.remove(&rowid);
                        }
                        Row::Text(m) => {
                            m.remove(&rowid);
                        }
                        Row::Real(m) => {
                            m.remove(&rowid);
                        }
                        Row::Bool(m) => {
                            m.remove(&rowid);
                        }
                        Row::Vector(m) => {
                            m.remove(&rowid);
                        }
                        Row::None => {}
                    }
                }
            }
        }

        // Strip secondary-index entries. Non-indexed columns just don't
        // show up in secondary_indexes and are no-ops here.
        for (col_name, value) in per_column_values {
            if let Some(idx) = self.index_for_column_mut(&col_name) {
                if let Some(v) = value {
                    idx.remove(&v, rowid);
                }
            }
        }
    }

    /// Replays a single row at `rowid` when loading a table from disk. Takes
    /// one typed value per column (in declaration order); `None` means the
    /// stored cell carried a NULL for that column. Unlike `insert_row` this
    /// trusts the on-disk state and does *not* re-check UNIQUE — we're
    /// rebuilding a state that was already consistent when it was saved.
    pub fn restore_row(&mut self, rowid: i64, values: Vec<Option<Value>>) -> Result<()> {
        if values.len() != self.columns.len() {
            return Err(SQLRiteError::Internal(format!(
                "cell has {} values but table '{}' has {} columns",
                values.len(),
                self.tb_name,
                self.columns.len()
            )));
        }

        let column_names: Vec<String> =
            self.columns.iter().map(|c| c.column_name.clone()).collect();

        for (i, value) in values.into_iter().enumerate() {
            let col_name = &column_names[i];

            // Write into the per-column row storage first (scoped borrow so
            // the secondary-index update below doesn't fight over `self`).
            {
                let rows_clone = Arc::clone(&self.rows);
                let mut row_data = rows_clone.lock().expect("rows mutex poisoned");
                let cell = row_data.get_mut(col_name).ok_or_else(|| {
                    SQLRiteError::Internal(format!("Row storage missing for column '{col_name}'"))
                })?;

                match (cell, &value) {
                    (Row::Integer(map), Some(Value::Integer(v))) => {
                        map.insert(rowid, *v as i32);
                    }
                    (Row::Integer(_), None) => {
                        return Err(SQLRiteError::Internal(format!(
                            "Integer column '{col_name}' cannot store NULL — corrupt cell?"
                        )));
                    }
                    (Row::Text(map), Some(Value::Text(s))) => {
                        map.insert(rowid, s.clone());
                    }
                    (Row::Text(map), None) => {
                        // Matches the on-insert convention: NULL in Text
                        // storage is represented by the literal "Null"
                        // sentinel and not added to the index.
                        map.insert(rowid, "Null".to_string());
                    }
                    (Row::Real(map), Some(Value::Real(v))) => {
                        map.insert(rowid, *v as f32);
                    }
                    (Row::Real(_), None) => {
                        return Err(SQLRiteError::Internal(format!(
                            "Real column '{col_name}' cannot store NULL — corrupt cell?"
                        )));
                    }
                    (Row::Bool(map), Some(Value::Bool(v))) => {
                        map.insert(rowid, *v);
                    }
                    (Row::Bool(_), None) => {
                        return Err(SQLRiteError::Internal(format!(
                            "Bool column '{col_name}' cannot store NULL — corrupt cell?"
                        )));
                    }
                    (Row::Vector(map), Some(Value::Vector(v))) => {
                        map.insert(rowid, v.clone());
                    }
                    (Row::Vector(_), None) => {
                        return Err(SQLRiteError::Internal(format!(
                            "Vector column '{col_name}' cannot store NULL — corrupt cell?"
                        )));
                    }
                    (row, v) => {
                        return Err(SQLRiteError::Internal(format!(
                            "Type mismatch restoring column '{col_name}': storage {row:?} vs value {v:?}"
                        )));
                    }
                }
            }

            // Maintain the secondary index (if any). NULL values are skipped
            // by `insert`, matching the "NULL is not indexed" convention.
            if let Some(v) = &value {
                if let Some(idx) = self.index_for_column_mut(col_name) {
                    idx.insert(v, rowid)?;
                }
            }
        }

        if rowid > self.last_rowid {
            self.last_rowid = rowid;
        }
        Ok(())
    }

    /// Extracts a row as an ordered `Vec<Option<Value>>` matching the column
    /// declaration order. Returns `None` entries for columns that hold NULL.
    /// Used by `save_database` to turn a table's in-memory state into cells.
    pub fn extract_row(&self, rowid: i64) -> Vec<Option<Value>> {
        self.columns
            .iter()
            .map(|c| match self.get_value(&c.column_name, rowid) {
                Some(Value::Null) => None,
                Some(v) => Some(v),
                None => None,
            })
            .collect()
    }

    /// Overwrites the cell at `(column, rowid)` with `new_val`. Enforces the
    /// column's datatype and UNIQUE constraint, and updates any secondary
    /// index.
    ///
    /// Returns `Err` if the column doesn't exist, the value type is incompatible,
    /// or writing would violate UNIQUE.
    pub fn set_value(&mut self, column: &str, rowid: i64, new_val: Value) -> Result<()> {
        let col_index = self
            .columns
            .iter()
            .position(|c| c.column_name == column)
            .ok_or_else(|| SQLRiteError::General(format!("Column '{column}' not found")))?;

        // No-op write — keep storage exactly the same.
        let current = self.get_value(column, rowid);
        if current.as_ref() == Some(&new_val) {
            return Ok(());
        }

        // Enforce UNIQUE. Prefer an O(log N) index probe if we have one;
        // fall back to a full column scan otherwise (Real / Bool / Vector
        // UNIQUE columns, which don't get auto-indexed).
        if self.columns[col_index].is_unique && !matches!(new_val, Value::Null) {
            if let Some(idx) = self.index_for_column(column) {
                for other in idx.lookup(&new_val) {
                    if other != rowid {
                        return Err(SQLRiteError::General(format!(
                            "UNIQUE constraint violated for column '{column}'"
                        )));
                    }
                }
            } else {
                for other in self.rowids() {
                    if other == rowid {
                        continue;
                    }
                    if self.get_value(column, other).as_ref() == Some(&new_val) {
                        return Err(SQLRiteError::General(format!(
                            "UNIQUE constraint violated for column '{column}'"
                        )));
                    }
                }
            }
        }

        // Drop the old index entry before writing the new value, so the
        // post-write index insert doesn't clash with the previous state.
        if let Some(old) = current {
            if let Some(idx) = self.index_for_column_mut(column) {
                idx.remove(&old, rowid);
            }
        }

        // Write into the column's Row, type-checking against the declared DataType.
        let declared = &self.columns[col_index].datatype;
        {
            let rows_clone = Arc::clone(&self.rows);
            let mut row_data = rows_clone.lock().expect("rows mutex poisoned");
            let cell = row_data.get_mut(column).ok_or_else(|| {
                SQLRiteError::Internal(format!("Row storage missing for column '{column}'"))
            })?;

            match (cell, &new_val, declared) {
                (Row::Integer(m), Value::Integer(v), _) => {
                    m.insert(rowid, *v as i32);
                }
                (Row::Real(m), Value::Real(v), _) => {
                    m.insert(rowid, *v as f32);
                }
                (Row::Real(m), Value::Integer(v), _) => {
                    m.insert(rowid, *v as f32);
                }
                (Row::Text(m), Value::Text(v), _) => {
                    m.insert(rowid, v.clone());
                }
                (Row::Bool(m), Value::Bool(v), _) => {
                    m.insert(rowid, *v);
                }
                (Row::Vector(m), Value::Vector(v), DataType::Vector(declared_dim)) => {
                    if v.len() != *declared_dim {
                        return Err(SQLRiteError::General(format!(
                            "Vector dimension mismatch for column '{column}': declared {declared_dim}, got {}",
                            v.len()
                        )));
                    }
                    m.insert(rowid, v.clone());
                }
                // NULL writes: store the sentinel "Null" string for Text. For other
                // types a NULL falls through to the mismatch arm below and returns
                // an error, since those BTreeMaps can't hold NULL today.
                (Row::Text(m), Value::Null, _) => {
                    m.insert(rowid, "Null".to_string());
                }
                (_, new, dt) => {
                    return Err(SQLRiteError::General(format!(
                        "Type mismatch: cannot assign {} to column '{column}' of type {dt}",
                        new.to_display_string()
                    )));
                }
            }
        }

        // Maintain the secondary index, if any. NULL values are skipped by
        // insert per convention.
        if !matches!(new_val, Value::Null) {
            if let Some(idx) = self.index_for_column_mut(column) {
                idx.insert(&new_val, rowid)?;
            }
        }

        Ok(())
    }

    /// Returns an immutable reference to the `sql::db::table::Column` whose
    /// name matches `column_name`, or an error if the table has no such column.
    #[allow(dead_code)]
    pub fn get_column(&mut self, column_name: String) -> Result<&Column> {
        // `find` stops at the first match, so no intermediate Vec is needed.
        self.columns
            .iter()
            .find(|c| c.column_name == column_name)
            .ok_or_else(|| SQLRiteError::General(String::from("Column not found.")))
    }

    /// Validates that the columns and values being inserted do not violate the UNIQUE constraint.
    /// PRIMARY KEY columns are automatically UNIQUE. Uses the corresponding
    /// secondary index when one exists (O(log N) lookup); falls back to a
    /// linear scan for UNIQUE columns that have no index (e.g. a Real
    /// UNIQUE column — Real isn't in the auto-indexed set).
    pub fn validate_unique_constraint(
        &mut self,
        cols: &Vec<String>,
        values: &Vec<String>,
    ) -> Result<()> {
        for (idx, name) in cols.iter().enumerate() {
            let column = self
                .columns
                .iter()
                .find(|c| &c.column_name == name)
                .ok_or_else(|| SQLRiteError::General(format!("Column '{name}' not found")))?;
            if !column.is_unique {
                continue;
            }
            let datatype = &column.datatype;
            let val = &values[idx];

            // Parse the string value into a runtime Value according to the
            // declared column type. If parsing fails the caller's insert
            // would also fail with the same error; surface it here so we
            // don't emit a misleading "unique OK" on bad input.
            let parsed = match datatype {
                DataType::Integer => val.parse::<i64>().map(Value::Integer).map_err(|_| {
                    SQLRiteError::General(format!(
                        "Type mismatch: expected INTEGER for column '{name}', got '{val}'"
                    ))
                })?,
                DataType::Text => Value::Text(val.clone()),
                DataType::Real => val.parse::<f64>().map(Value::Real).map_err(|_| {
                    SQLRiteError::General(format!(
                        "Type mismatch: expected REAL for column '{name}', got '{val}'"
                    ))
                })?,
                DataType::Bool => val.parse::<bool>().map(Value::Bool).map_err(|_| {
                    SQLRiteError::General(format!(
                        "Type mismatch: expected BOOL for column '{name}', got '{val}'"
                    ))
                })?,
                DataType::Vector(declared_dim) => {
                    let parsed_vec = parse_vector_literal(val).map_err(|e| {
                        SQLRiteError::General(format!(
                            "Type mismatch: expected VECTOR({declared_dim}) for column '{name}', {e}"
                        ))
                    })?;
                    if parsed_vec.len() != *declared_dim {
                        return Err(SQLRiteError::General(format!(
                            "Vector dimension mismatch for column '{name}': declared {declared_dim}, got {}",
                            parsed_vec.len()
                        )));
                    }
                    Value::Vector(parsed_vec)
                }
                DataType::None | DataType::Invalid => {
                    return Err(SQLRiteError::Internal(format!(
                        "column '{name}' has an unsupported datatype"
                    )));
                }
            };

            if let Some(secondary) = self.index_for_column(name) {
                if secondary.would_violate_unique(&parsed) {
                    return Err(SQLRiteError::General(format!(
                        "UNIQUE constraint violated for column '{name}': value '{val}' already exists"
                    )));
                }
            } else {
                // No secondary index (Real / Bool / Vector UNIQUE). Linear scan.
                for other in self.rowids() {
                    if self.get_value(name, other).as_ref() == Some(&parsed) {
                        return Err(SQLRiteError::General(format!(
                            "UNIQUE constraint violated for column '{name}': value '{val}' already exists"
                        )));
                    }
                }
            }
        }
        Ok(())
    }

    /// Inserts all VALUES into their appropriate COLUMNS, using the ROWID as an embedded INDEX on all ROWS.
    /// Every `Table` keeps track of the `last_rowid` so that it can derive the next one.
    /// One limitation of this data structure is that we can only have one write transaction at a time; otherwise
    /// we could have a race condition on the `last_rowid`.
    ///
    /// Since we are loosely modeling after SQLite, which also allows only one write transaction at a time,
    /// we are good. :)
    ///
    /// Returns `Err` (leaving the table unchanged) when the user supplies an
    /// incompatibly-typed value, so bad input no longer panics.
660
    pub fn insert_row(&mut self, cols: &Vec<String>, values: &Vec<String>) -> Result<()> {
2✔
661
        let mut next_rowid = self.last_rowid + 1;
2✔
662

663
        // Auto-assign INTEGER PRIMARY KEY when the user omits it; otherwise
664
        // adopt the supplied value as the new rowid.
665
        if self.primary_key != "-1" {
2✔
666
            if !cols.iter().any(|col| col == &self.primary_key) {
6✔
667
                // Write the auto-assigned PK into row storage, then sync
668
                // the secondary index.
669
                let val = next_rowid as i32;
2✔
670
                let wrote_integer = {
×
671
                    let rows_clone = Arc::clone(&self.rows);
2✔
672
                    let mut row_data = rows_clone.lock().expect("rows mutex poisoned");
4✔
673
                    let table_col_data = row_data.get_mut(&self.primary_key).ok_or_else(|| {
4✔
674
                        SQLRiteError::Internal(format!(
×
675
                            "Row storage missing for primary key column '{}'",
×
676
                            self.primary_key
×
677
                        ))
678
                    })?;
679
                    match table_col_data {
2✔
680
                        Row::Integer(tree) => {
2✔
681
                            tree.insert(next_rowid, val);
2✔
682
                            true
2✔
683
                        }
684
                        _ => false, // non-integer PK: auto-assign is a no-op
×
685
                    }
686
                };
687
                if wrote_integer {
2✔
688
                    let pk = self.primary_key.clone();
2✔
689
                    if let Some(idx) = self.index_for_column_mut(&pk) {
4✔
690
                        idx.insert(&Value::Integer(val as i64), next_rowid)?;
2✔
691
                    }
692
                }
693
            } else {
694
                for i in 0..cols.len() {
2✔
695
                    if cols[i] == self.primary_key {
2✔
696
                        let val = &values[i];
1✔
697
                        next_rowid = val.parse::<i64>().map_err(|_| {
1✔
698
                            SQLRiteError::General(format!(
×
699
                                "Type mismatch: PRIMARY KEY column '{}' expects INTEGER, got '{val}'",
×
700
                                self.primary_key
×
701
                            ))
702
                        })?;
703
                    }
704
                }
705
            }
706
        }
707

708
        // For every table column, either pick the supplied value or pad with NULL
709
        // so that every column's BTreeMap keeps the same rowid keyset.
710
        let column_names = self
2✔
711
            .columns
×
712
            .iter()
713
            .map(|col| col.column_name.to_string())
6✔
714
            .collect::<Vec<String>>();
715
        let mut j: usize = 0;
2✔
716
        for i in 0..column_names.len() {
4✔
717
            let mut val = String::from("Null");
2✔
718
            let key = &column_names[i];
4✔
719

720
            if let Some(supplied_key) = cols.get(j) {
2✔
721
                if supplied_key == &column_names[i] {
6✔
722
                    val = values[j].to_string();
4✔
723
                    j += 1;
2✔
724
                } else if self.primary_key == column_names[i] {
4✔
725
                    // PK already stored in the auto-assign branch above.
726
                    continue;
×
727
                }
728
            } else if self.primary_key == column_names[i] {
2✔
729
                continue;
×
730
            }
731

732
            // Step 1: write into row storage and compute the typed Value
733
            // we'll hand to the secondary index (if any).
734
            let typed_value: Option<Value> = {
×
735
                let rows_clone = Arc::clone(&self.rows);
4✔
736
                let mut row_data = rows_clone.lock().expect("rows mutex poisoned");
4✔
737
                let table_col_data = row_data.get_mut(key).ok_or_else(|| {
4✔
738
                    SQLRiteError::Internal(format!("Row storage missing for column '{key}'"))
×
739
                })?;
740

741
                match table_col_data {
2✔
742
                    Row::Integer(tree) => {
1✔
743
                        let parsed = val.parse::<i32>().map_err(|_| {
5✔
744
                            SQLRiteError::General(format!(
1✔
745
                                "Type mismatch: expected INTEGER for column '{key}', got '{val}'"
×
746
                            ))
747
                        })?;
748
                        tree.insert(next_rowid, parsed);
1✔
749
                        Some(Value::Integer(parsed as i64))
1✔
750
                    }
751
                    Row::Text(tree) => {
2✔
752
                        tree.insert(next_rowid, val.to_string());
4✔
753
                        // "Null" sentinel stays out of the index — it isn't a
754
                        // real user value.
755
                        if val != "Null" {
5✔
756
                            Some(Value::Text(val.to_string()))
2✔
757
                        } else {
758
                            None
1✔
759
                        }
760
                    }
761
                    Row::Real(tree) => {
×
762
                        let parsed = val.parse::<f32>().map_err(|_| {
×
763
                            SQLRiteError::General(format!(
×
764
                                "Type mismatch: expected REAL for column '{key}', got '{val}'"
×
765
                            ))
766
                        })?;
767
                        tree.insert(next_rowid, parsed);
×
768
                        Some(Value::Real(parsed as f64))
×
769
                    }
770
                    Row::Bool(tree) => {
×
771
                        let parsed = val.parse::<bool>().map_err(|_| {
×
772
                            SQLRiteError::General(format!(
×
773
                                "Type mismatch: expected BOOL for column '{key}', got '{val}'"
×
774
                            ))
775
                        })?;
776
                        tree.insert(next_rowid, parsed);
×
777
                        Some(Value::Bool(parsed))
×
778
                    }
779
                    Row::Vector(tree) => {
1✔
780
                        // The parser put a bracket-array literal into `val`
781
                        // (e.g. "[0.1,0.2,0.3]"). Parse it back here and
782
                        // dim-check against the column's declared
783
                        // DataType::Vector(N).
784
                        let parsed = parse_vector_literal(&val).map_err(|e| {
2✔
NEW
785
                            SQLRiteError::General(format!(
×
NEW
786
                                "Type mismatch: expected VECTOR for column '{key}', {e}"
×
787
                            ))
788
                        })?;
789
                        let declared_dim = match &self.columns[i].datatype {
2✔
790
                            DataType::Vector(d) => *d,
1✔
NEW
791
                            other => {
×
NEW
792
                                return Err(SQLRiteError::Internal(format!(
×
NEW
793
                                    "Row::Vector storage on non-Vector column '{key}' (declared as {other})"
×
794
                                )));
795
                            }
796
                        };
797
                        if parsed.len() != declared_dim {
2✔
798
                            return Err(SQLRiteError::General(format!(
1✔
NEW
799
                                "Vector dimension mismatch for column '{key}': declared {declared_dim}, got {}",
×
800
                                parsed.len()
2✔
801
                            )));
802
                        }
803
                        tree.insert(next_rowid, parsed.clone());
2✔
804
                        Some(Value::Vector(parsed))
1✔
805
                    }
806
                    Row::None => {
×
807
                        return Err(SQLRiteError::Internal(format!(
×
808
                            "Column '{key}' has no row storage"
×
809
                        )));
810
                    }
811
                }
812
            };
813

814
            // Step 2: maintain the secondary index (if any). insert() is a
815
            // no-op for Value::Null and cheap for other value kinds.
816
            if let Some(v) = typed_value {
2✔
817
                if let Some(idx) = self.index_for_column_mut(key) {
4✔
818
                    idx.insert(&v, next_rowid)?;
2✔
819
                }
820
            }
821
        }
822
        self.last_rowid = next_rowid;
2✔
823
        Ok(())
2✔
824
    }
825

    /// Print the table schema to standard output in a pretty formatted way.
    ///
    /// # Example
    ///
    /// ```text
    /// let table = Table::new(payload);
    /// table.print_table_schema();
    ///
    /// Prints to standard output:
    ///    +-------------+-----------+-------------+--------+----------+
    ///    | Column Name | Data Type | PRIMARY KEY | UNIQUE | NOT NULL |
    ///    +-------------+-----------+-------------+--------+----------+
    ///    | id          | Integer   | true        | true   | true     |
    ///    +-------------+-----------+-------------+--------+----------+
    ///    | name        | Text      | false       | true   | false    |
    ///    +-------------+-----------+-------------+--------+----------+
    ///    | email       | Text      | false       | false  | false    |
    ///    +-------------+-----------+-------------+--------+----------+
    /// ```
    ///
846
    pub fn print_table_schema(&self) -> Result<usize> {
3✔
847
        let mut table = PrintTable::new();
2✔
848
        table.add_row(row![
6✔
849
            "Column Name",
×
850
            "Data Type",
×
851
            "PRIMARY KEY",
×
852
            "UNIQUE",
×
853
            "NOT NULL"
×
854
        ]);
855

856
        for col in &self.columns {
2✔
857
            table.add_row(row![
14✔
858
                col.column_name,
2✔
UNCOV
859
                col.datatype,
×
860
                col.is_pk,
2✔
861
                col.is_unique,
2✔
862
                col.not_null
2✔
863
            ]);
864
        }
865

866
        table.printstd();
2✔
867
        Ok(table.len() * 2 + 1)
2✔
868
    }
869

    /// Print the table data to standard output in a pretty formatted way.
    ///
    /// # Example
    ///
    /// ```text
    /// let db_table = db.get_table_mut(table_name.to_string()).unwrap();
    /// db_table.print_table_data();
    ///
    /// Prints to standard output:
    ///     +----+---------+------------------------+
    ///     | id | name    | email                  |
    ///     +----+---------+------------------------+
    ///     | 1  | "Jack"  | "jack@mail.com"        |
    ///     +----+---------+------------------------+
    ///     | 10 | "Bob"   | "bob@main.com"         |
    ///     +----+---------+------------------------+
    ///     | 11 | "Bill"  | "bill@main.com"        |
    ///     +----+---------+------------------------+
    /// ```
    ///
890
    pub fn print_table_data(&self) {
2✔
891
        let mut print_table = PrintTable::new();
2✔
892

893
        let column_names = self
2✔
894
            .columns
×
895
            .iter()
896
            .map(|col| col.column_name.to_string())
6✔
897
            .collect::<Vec<String>>();
898

899
        let header_row = PrintRow::new(
900
            column_names
2✔
901
                .iter()
2✔
902
                .map(|col| PrintCell::new(col))
6✔
903
                .collect::<Vec<PrintCell>>(),
2✔
904
        );
905

906
        let rows_clone = Arc::clone(&self.rows);
4✔
907
        let row_data = rows_clone.lock().expect("rows mutex poisoned");
4✔
908
        let first_col_data = row_data
4✔
909
            .get(&self.columns.first().unwrap().column_name)
2✔
910
            .unwrap();
911
        let num_rows = first_col_data.count();
2✔
912
        let mut print_table_rows: Vec<PrintRow> = vec![PrintRow::new(vec![]); num_rows];
2✔
913

914
        for col_name in &column_names {
4✔
915
            let col_val = row_data
4✔
916
                .get(col_name)
2✔
917
                .expect("Can't find any rows with the given column");
918
            let columns: Vec<String> = col_val.get_serialized_col_data();
2✔
919

920
            for i in 0..num_rows {
4✔
921
                if let Some(cell) = &columns.get(i) {
4✔
922
                    print_table_rows[i].add_cell(PrintCell::new(cell));
4✔
923
                } else {
924
                    print_table_rows[i].add_cell(PrintCell::new(""));
×
925
                }
926
            }
927
        }
928

929
        print_table.add_row(header_row);
2✔
930
        for row in print_table_rows {
4✔
931
            print_table.add_row(row);
4✔
932
        }
933

934
        print_table.printstd();
2✔
935
    }
936
}
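
The rowid handling at the top of `insert_row` can be sketched in isolation. This is a minimal illustration, not the crate's code: `next_rowid` and `pk_cell` are hypothetical helper names, and `String` stands in for `SQLRiteError`. It also shows one possible alternative to the `next_rowid as i32` cast, which silently wraps for rowids outside the `i32` range.

```rust
use std::convert::TryFrom;

/// Pick the rowid for a new row (hypothetical helper).
/// A supplied PRIMARY KEY value is parsed and adopted as the rowid;
/// otherwise the rowid is `last_rowid + 1`.
fn next_rowid(last_rowid: i64, supplied_pk: Option<&str>) -> Result<i64, String> {
    match supplied_pk {
        Some(raw) => raw
            .parse::<i64>()
            .map_err(|_| format!("PRIMARY KEY expects INTEGER, got '{raw}'")),
        None => last_rowid
            .checked_add(1)
            .ok_or_else(|| "rowid overflow".to_string()),
    }
}

/// Narrow a rowid to the i32 cell stored in an integer column (hypothetical helper).
/// `i32::try_from` reports out-of-range values instead of wrapping like `as i32`.
fn pk_cell(rowid: i64) -> Result<i32, String> {
    i32::try_from(rowid).map_err(|_| format!("rowid {rowid} does not fit in an i32 cell"))
}
```

Whether a checked conversion is the right trade-off here depends on how large rowids are expected to grow; widening the integer storage would be another option.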
937

/// The schema for each SQL column in every table.
///
/// Per-column index state moved to `Table::secondary_indexes` in Phase 3e;
/// a single `Column` describes the declared schema (name, type, constraints)
/// and nothing more.
#[derive(PartialEq, Debug, Clone)]
pub struct Column {
    pub column_name: String,
    pub datatype: DataType,
    pub is_pk: bool,
    pub not_null: bool,
    pub is_unique: bool,
}
951

952
impl Column {
953
    pub fn new(
2✔
954
        name: String,
955
        datatype: String,
956
        is_pk: bool,
957
        not_null: bool,
958
        is_unique: bool,
959
    ) -> Self {
960
        let dt = DataType::new(datatype);
4✔
961
        Column {
962
            column_name: name,
963
            datatype: dt,
964
            is_pk,
965
            not_null,
966
            is_unique,
967
        }
968
    }
969
}
970

/// The schema for each SQL row in every table is represented in memory
/// by the following structure.
///
/// This is an enum representing each of the available types organized in a BTreeMap
/// data structure, using the ROWID as key and each corresponding type as value.
#[derive(PartialEq, Debug, Clone)]
pub enum Row {
    Integer(BTreeMap<i64, i32>),
    Text(BTreeMap<i64, String>),
    Real(BTreeMap<i64, f32>),
    Bool(BTreeMap<i64, bool>),
    /// Phase 7a: dense f32 vector storage. Each `Vec<f32>` should have
    /// length matching the column's declared `DataType::Vector(dim)`,
    /// enforced at INSERT time. The Row variant doesn't carry the dim;
    /// it lives in the column metadata.
    Vector(BTreeMap<i64, Vec<f32>>),
    None,
}
989

990
impl Row {
991
    fn get_serialized_col_data(&self) -> Vec<String> {
2✔
992
        match self {
2✔
993
            Row::Integer(cd) => cd.values().map(|v| v.to_string()).collect(),
6✔
994
            Row::Real(cd) => cd.values().map(|v| v.to_string()).collect(),
×
995
            Row::Text(cd) => cd.values().map(|v| v.to_string()).collect(),
6✔
996
            Row::Bool(cd) => cd.values().map(|v| v.to_string()).collect(),
×
997
            Row::Vector(cd) => cd.values().map(format_vector_for_display).collect(),
1✔
UNCOV
998
            Row::None => panic!("Found None in columns"),
×
999
        }
1000
    }
1001

1002
    fn count(&self) -> usize {
2✔
1003
        match self {
2✔
1004
            Row::Integer(cd) => cd.len(),
2✔
1005
            Row::Real(cd) => cd.len(),
×
1006
            Row::Text(cd) => cd.len(),
1✔
1007
            Row::Bool(cd) => cd.len(),
×
NEW
1008
            Row::Vector(cd) => cd.len(),
×
UNCOV
1009
            Row::None => panic!("Found None in columns"),
×
1010
        }
1011
    }
1012

1013
    /// Every column's BTreeMap is keyed by ROWID. All columns share the same keyset
1014
    /// after an INSERT (missing columns are padded), so any column's keys are a valid
1015
    /// iteration of the table's rowids.
1016
    pub fn rowids(&self) -> Vec<i64> {
2✔
1017
        match self {
2✔
1018
            Row::Integer(m) => m.keys().copied().collect(),
2✔
1019
            Row::Text(m) => m.keys().copied().collect(),
2✔
1020
            Row::Real(m) => m.keys().copied().collect(),
×
1021
            Row::Bool(m) => m.keys().copied().collect(),
×
NEW
1022
            Row::Vector(m) => m.keys().copied().collect(),
×
UNCOV
1023
            Row::None => vec![],
×
1024
        }
1025
    }
1026

1027
    pub fn get(&self, rowid: i64) -> Option<Value> {
2✔
1028
        match self {
2✔
1029
            Row::Integer(m) => m.get(&rowid).map(|v| Value::Integer(i64::from(*v))),
6✔
1030
            // INSERT stores the literal string "Null" in Text columns that were omitted
1031
            // from the query — re-map that back to a real NULL on read.
1032
            Row::Text(m) => m.get(&rowid).map(|v| {
4✔
1033
                if v == "Null" {
4✔
1034
                    Value::Null
1✔
1035
                } else {
1036
                    Value::Text(v.clone())
2✔
1037
                }
1038
            }),
1039
            Row::Real(m) => m.get(&rowid).map(|v| Value::Real(f64::from(*v))),
×
1040
            Row::Bool(m) => m.get(&rowid).map(|v| Value::Bool(*v)),
×
1041
            Row::Vector(m) => m.get(&rowid).map(|v| Value::Vector(v.clone())),
3✔
UNCOV
1042
            Row::None => None,
×
1043
        }
1044
    }
1045
}
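
The shared-keyset invariant that `rowids` relies on can be shown with a cut-down model. This is a sketch under stated assumptions: `ColumnData` and `MiniTable` are hypothetical stand-ins for `Row` and `Table`, with only two columns and no locking or indexes.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for `Row`: one BTreeMap per column, keyed by rowid.
enum ColumnData {
    Integer(BTreeMap<i64, i32>),
    Text(BTreeMap<i64, String>),
}

impl ColumnData {
    /// The rowids present in this column, in ascending key order.
    fn rowids(&self) -> Vec<i64> {
        match self {
            ColumnData::Integer(m) => m.keys().copied().collect(),
            ColumnData::Text(m) => m.keys().copied().collect(),
        }
    }
}

/// Hypothetical two-column table (`id`, `name`).
struct MiniTable {
    id: ColumnData,
    name: ColumnData,
}

impl MiniTable {
    fn new() -> Self {
        MiniTable {
            id: ColumnData::Integer(BTreeMap::new()),
            name: ColumnData::Text(BTreeMap::new()),
        }
    }

    /// Write one row. An omitted `name` is padded with the "Null" sentinel
    /// so both columns always hold the same set of rowids.
    fn insert(&mut self, rowid: i64, name: Option<&str>) {
        if let ColumnData::Integer(m) = &mut self.id {
            m.insert(rowid, rowid as i32);
        }
        if let ColumnData::Text(m) = &mut self.name {
            m.insert(rowid, name.unwrap_or("Null").to_string());
        }
    }
}
```

Because every insert touches every column, iterating the keys of any single column visits each row exactly once, in rowid order.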
1046

/// Render a vector for human display. Used by both `Row::get_serialized_col_data`
/// (for the REPL's print-table path) and `Value::to_display_string`.
///
/// Format: `[0.1, 0.2, 0.3]`, JSON-like, decimal-minimal via `{}` Display.
/// For high-dimensional vectors (e.g. 384 elements) this produces a long
/// line; truncation ellipsis is a future polish (see Phase 7 plan, "What
/// this proposal does NOT commit to").
1054
fn format_vector_for_display(v: &Vec<f32>) -> String {
2✔
1055
    let mut s = String::with_capacity(v.len() * 6 + 2);
1✔
1056
    s.push('[');
1✔
1057
    for (i, x) in v.iter().enumerate() {
1✔
1058
        if i > 0 {
1✔
1059
            s.push_str(", ");
1✔
1060
        }
1061
        // Default f32 Display picks the minimal-roundtrip representation,
1062
        // so 0.1f32 prints as "0.1" not "0.10000000149011612". Good enough.
1063
        s.push_str(&x.to_string());
2✔
1064
    }
1065
    s.push(']');
1✔
1066
    s
1✔
1067
}
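
The display behaviour that `format_vector_for_display` depends on can be checked with a small standalone variant. This is an illustrative sketch, not the crate's function: `format_vector` is a hypothetical name, and it takes a slice and uses `join` rather than pushing onto a pre-sized `String`.

```rust
/// Render a slice of f32 as `[a, b, c]`, using the default `Display`
/// for each element (hypothetical standalone variant).
fn format_vector(v: &[f32]) -> String {
    let parts: Vec<String> = v.iter().map(|x| x.to_string()).collect();
    format!("[{}]", parts.join(", "))
}

/// Check that the text produced by `Display` parses back to the same f32.
fn display_round_trips(x: f32) -> bool {
    x.to_string().parse::<f32>().map(|y| y == x).unwrap_or(false)
}
```

One consequence worth noting: `Display` prints whole-number floats without a fractional part, so `format_vector(&[1.0, 2.5])` yields `[1, 2.5]`.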
1068

/// Runtime value produced by query execution. Separate from the storage-layer `Row` enum
/// so the executor can carry typed values (including NULL) across operators.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Integer(i64),
    Text(String),
    Real(f64),
    Bool(bool),
    /// Phase 7a: dense f32 vector as a runtime value. Carries its own
    /// dimension implicitly via `Vec::len`; the column it's being
    /// assigned to has a declared `DataType::Vector(N)` that's checked
    /// at INSERT/UPDATE time.
    Vector(Vec<f32>),
    Null,
}
1084

1085
impl Value {
1086
    pub fn to_display_string(&self) -> String {
1✔
1087
        match self {
1✔
1088
            Value::Integer(v) => v.to_string(),
1✔
1089
            Value::Text(s) => s.clone(),
1✔
1090
            Value::Real(f) => f.to_string(),
×
1091
            Value::Bool(b) => b.to_string(),
×
1092
            Value::Vector(v) => format_vector_for_display(v),
1✔
1093
            Value::Null => String::from("NULL"),
1094
        }
1095
    }
1096
}
1097

/// Parse a bracket-array literal like `"[0.1, 0.2, 0.3]"` (or `"[1, 2, 3]"`)
/// into a `Vec<f32>`. The parser/insert pipeline stores vector literals as
/// strings in `InsertQuery::rows` (a `Vec<Vec<String>>`); this helper is
/// the inverse: it turns the string back into a typed vector at the boundary
/// where we actually need element-typed data.
///
/// Accepts:
/// - `[]` → empty vector (caller's dimension check rejects it for VECTOR(N≥1))
/// - `[0.1, 0.2, 0.3]` → standard float syntax
/// - `[1, 2, 3]` → integers, coerced to f32 (matches `VALUES (1, 2)` for
///   `REAL` columns; we widen ints to floats automatically)
/// - whitespace tolerated everywhere (Python/JSON/pgvector convention)
/// - NaN / Inf literals (`f32::from_str` parses them and we let them through
///   for now; the caller can reject them if undesired, and HNSW etc. will
///   reject NaN at index time)
///
/// Rejects with a descriptive message:
/// - missing `[` / `]`
/// - non-numeric elements (`['foo', 0.1]`)
1117
pub fn parse_vector_literal(s: &str) -> Result<Vec<f32>> {
1✔
1118
    let trimmed = s.trim();
1✔
1119
    if !trimmed.starts_with('[') || !trimmed.ends_with(']') {
2✔
1120
        return Err(SQLRiteError::General(format!(
1✔
1121
            "expected bracket-array literal `[...]`, got `{s}`"
1122
        )));
1123
    }
1124
    let inner = trimmed[1..trimmed.len() - 1].trim();
2✔
1125
    if inner.is_empty() {
1✔
1126
        return Ok(Vec::new());
1✔
1127
    }
1128
    let mut out = Vec::new();
1✔
1129
    for (i, part) in inner.split(',').enumerate() {
2✔
1130
        let element = part.trim();
2✔
1131
        let parsed: f32 = element.parse().map_err(|_| {
3✔
1132
            SQLRiteError::General(format!("vector element {i} (`{element}`) is not a number"))
1✔
1133
        })?;
1134
        out.push(parsed);
1✔
1135
    }
1136
    Ok(out)
1✔
1137
}
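
The parse-then-validate flow used for VECTOR values (parse the literal, then compare against the declared dimension) can be sketched as a standalone pair of functions. The names `parse_literal` and `parse_checked` are hypothetical, `String` stands in for `SQLRiteError`, and the finiteness check is an assumption about what a stricter caller might add; the code above deliberately lets NaN and Inf through.

```rust
/// Hypothetical stand-in for the crate's result type.
type ParseResult<T> = Result<T, String>;

/// Parse `[a, b, c]` into f32 values, mirroring the shape of `parse_vector_literal`.
fn parse_literal(s: &str) -> ParseResult<Vec<f32>> {
    let inner = s
        .trim()
        .strip_prefix('[')
        .and_then(|rest| rest.strip_suffix(']'))
        .ok_or_else(|| format!("expected bracket-array literal `[...]`, got `{s}`"))?
        .trim();
    if inner.is_empty() {
        return Ok(Vec::new());
    }
    inner
        .split(',')
        .enumerate()
        .map(|(i, part)| {
            let element = part.trim();
            element
                .parse::<f32>()
                .map_err(|_| format!("vector element {i} (`{element}`) is not a number"))
        })
        .collect()
}

/// Parse, then enforce the declared dimension and reject non-finite elements.
fn parse_checked(s: &str, declared_dim: usize) -> ParseResult<Vec<f32>> {
    let v = parse_literal(s)?;
    if v.len() != declared_dim {
        return Err(format!(
            "dimension mismatch: declared {declared_dim}, got {}",
            v.len()
        ));
    }
    if let Some(i) = v.iter().position(|x| !x.is_finite()) {
        return Err(format!("vector element {i} is not finite"));
    }
    Ok(v)
}
```

Keeping the dimension and finiteness checks outside the literal parser matches the split described in the doc comment: the parser stays schema-free, and each caller decides how strict to be.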
1138

1139
#[cfg(test)]
1140
mod tests {
1141
    use super::*;
1142
    use sqlparser::dialect::SQLiteDialect;
1143
    use sqlparser::parser::Parser;
1144

1145
    #[test]
1146
    fn datatype_display_trait_test() {
3✔
1147
        let integer = DataType::Integer;
1✔
1148
        let text = DataType::Text;
1✔
1149
        let real = DataType::Real;
1✔
1150
        let boolean = DataType::Bool;
1✔
1151
        let vector = DataType::Vector(384);
1✔
1152
        let none = DataType::None;
1✔
1153
        let invalid = DataType::Invalid;
1✔
1154

1155
        assert_eq!(format!("{}", integer), "Integer");
1✔
1156
        assert_eq!(format!("{}", text), "Text");
1✔
1157
        assert_eq!(format!("{}", real), "Real");
1✔
1158
        assert_eq!(format!("{}", boolean), "Boolean");
1✔
1159
        assert_eq!(format!("{}", vector), "Vector(384)");
1✔
1160
        assert_eq!(format!("{}", none), "None");
1✔
1161
        assert_eq!(format!("{}", invalid), "Invalid");
1✔
1162
    }
1163

1164
    // -----------------------------------------------------------------
1165
    // Phase 7a — VECTOR(N) column type
1166
    // -----------------------------------------------------------------
1167

1168
    #[test]
1169
    fn datatype_new_parses_vector_dim() {
3✔
1170
        // Standard cases.
1171
        assert_eq!(DataType::new("vector(1)".to_string()), DataType::Vector(1));
1✔
1172
        assert_eq!(
1✔
1173
            DataType::new("vector(384)".to_string()),
1✔
1174
            DataType::Vector(384)
1175
        );
1176
        assert_eq!(
1✔
1177
            DataType::new("vector(1536)".to_string()),
1✔
1178
            DataType::Vector(1536)
1179
        );
1180

1181
        // Case-insensitive on the keyword.
1182
        assert_eq!(
1✔
1183
            DataType::new("VECTOR(384)".to_string()),
1✔
1184
            DataType::Vector(384)
1185
        );
1186

1187
        // Whitespace inside parens tolerated (the create-parser strips it
1188
        // but the string-based round-trip in DataType::new is the one place
1189
        // we don't fully control input formatting).
1190
        assert_eq!(
1✔
1191
            DataType::new("vector( 64 )".to_string()),
1✔
1192
            DataType::Vector(64)
1193
        );
1194
    }
1195

1196
    #[test]
1197
    fn datatype_new_rejects_bad_vector_strings() {
3✔
1198
        // dim = 0 is rejected (Q2: VECTOR(N≥1)).
1199
        assert_eq!(DataType::new("vector(0)".to_string()), DataType::Invalid);
1✔
1200
        // Non-numeric dim.
1201
        assert_eq!(DataType::new("vector(abc)".to_string()), DataType::Invalid);
1✔
1202
        // Empty parens.
1203
        assert_eq!(DataType::new("vector()".to_string()), DataType::Invalid);
1✔
1204
        // Negative dim wouldn't even parse as usize, so falls into Invalid.
1205
        assert_eq!(DataType::new("vector(-3)".to_string()), DataType::Invalid);
1✔
1206
    }
1207

1208
    #[test]
1209
    fn datatype_to_wire_string_round_trips_vector() {
3✔
1210
        let dt = DataType::Vector(384);
1✔
1211
        let wire = dt.to_wire_string();
1✔
1212
        assert_eq!(wire, "vector(384)");
2✔
1213
        // And feeds back through DataType::new losslessly — this is the
1214
        // round-trip the ParsedColumn pipeline relies on.
1215
        assert_eq!(DataType::new(wire), DataType::Vector(384));
1✔
1216
    }
1217

1218
    #[test]
1219
    fn parse_vector_literal_accepts_floats() {
3✔
1220
        let v = parse_vector_literal("[0.1, 0.2, 0.3]").expect("parse");
1✔
1221
        assert_eq!(v, vec![0.1f32, 0.2, 0.3]);
2✔
1222
    }
1223

1224
    #[test]
1225
    fn parse_vector_literal_accepts_ints_widening_to_f32() {
3✔
1226
        let v = parse_vector_literal("[1, 2, 3]").expect("parse");
1✔
1227
        assert_eq!(v, vec![1.0f32, 2.0, 3.0]);
2✔
1228
    }
1229

1230
    #[test]
1231
    fn parse_vector_literal_handles_negatives_and_whitespace() {
3✔
1232
        let v = parse_vector_literal("[ -1.5 ,  2.0,  -3.5 ]").expect("parse");
1✔
1233
        assert_eq!(v, vec![-1.5f32, 2.0, -3.5]);
2✔
1234
    }
1235

1236
    #[test]
1237
    fn parse_vector_literal_empty_brackets_is_empty_vec() {
3✔
1238
        let v = parse_vector_literal("[]").expect("parse");
1✔
1239
        assert!(v.is_empty());
2✔
1240
    }
1241

1242
    #[test]
1243
    fn parse_vector_literal_rejects_non_bracketed() {
3✔
1244
        assert!(parse_vector_literal("0.1, 0.2").is_err());
1✔
1245
        assert!(parse_vector_literal("(0.1, 0.2)").is_err());
1✔
1246
        assert!(parse_vector_literal("[0.1, 0.2").is_err()); // missing ]
1✔
1247
        assert!(parse_vector_literal("0.1, 0.2]").is_err()); // missing [
1✔
1248
    }
1249

1250
    #[test]
1251
    fn parse_vector_literal_rejects_non_numeric_elements() {
4✔
1252
        let err = parse_vector_literal("[1.0, 'foo', 3.0]").unwrap_err();
1✔
1253
        let msg = format!("{err}");
2✔
NEW
1254
        assert!(
×
1255
            msg.contains("vector element 1") && msg.contains("'foo'"),
3✔
1256
            "error message should pinpoint the bad element: got `{msg}`"
1257
        );
1258
    }
1259

1260
    #[test]
1261
    fn value_vector_display_format() {
3✔
1262
        let v = Value::Vector(vec![0.1, 0.2, 0.3]);
1✔
1263
        assert_eq!(v.to_display_string(), "[0.1, 0.2, 0.3]");
2✔
1264

1265
        // Empty vector displays as `[]`.
1266
        let empty = Value::Vector(vec![]);
1✔
1267
        assert_eq!(empty.to_display_string(), "[]");
2✔
1268
    }
1269

1270
    #[test]
1271
    fn create_new_table_test() {
3✔
1272
        let query_statement = "CREATE TABLE contacts (
1✔
1273
            id INTEGER PRIMARY KEY,
1274
            first_name TEXT NOT NULL,
1275
            last_name TEXT NOT NULL,
1276
            email TEXT NOT NULL UNIQUE,
1277
            active BOOL,
1278
            score REAL
1279
        );";
1280
        let dialect = SQLiteDialect {};
1281
        let mut ast = Parser::parse_sql(&dialect, query_statement).unwrap();
1✔
1282
        if ast.len() > 1 {
2✔
1283
            panic!("Expected a single query statement, but there are more than 1.")
×
1284
        }
1285
        let query = ast.pop().unwrap();
2✔
1286

1287
        let create_query = CreateQuery::new(&query).unwrap();
2✔
1288

1289
        let table = Table::new(create_query);
1✔
1290

1291
        assert_eq!(table.columns.len(), 6);
2✔
1292
        assert_eq!(table.last_rowid, 0);
1✔
1293

1294
        let id_column = "id".to_string();
1✔
1295
        if let Some(column) = table
3✔
1296
            .columns
1297
            .iter()
1298
            .filter(|c| c.column_name == id_column)
3✔
1299
            .collect::<Vec<&Column>>()
1300
            .first()
1301
        {
1302
            assert!(column.is_pk);
1✔
1303
            assert_eq!(column.datatype, DataType::Integer);
1✔
1304
        } else {
1305
            panic!("column not found");
×
1306
        }
1307
    }
1308

1309
    #[test]
1310
    fn print_table_schema_test() {
3✔
1311
        let query_statement = "CREATE TABLE contacts (
1✔
1312
            id INTEGER PRIMARY KEY,
1313
            first_name TEXT NOT NULL,
1314
            last_name TEXT NOT NULL
1315
        );";
1316
        let dialect = SQLiteDialect {};
1317
        let mut ast = Parser::parse_sql(&dialect, query_statement).unwrap();
1✔
1318
        if ast.len() > 1 {
2✔
1319
            panic!("Expected a single query statement, but there are more than 1.")
×
1320
        }
1321
        let query = ast.pop().unwrap();
2✔
1322

1323
        let create_query = CreateQuery::new(&query).unwrap();
2✔
1324

1325
        let table = Table::new(create_query);
1✔
1326
        let lines_printed = table.print_table_schema();
1✔
1327
        assert_eq!(lines_printed, Ok(9));
2✔
1328
    }
1329
}