ebdiskindex
Index files on external drives without keeping them plugged in. Search by filename, find duplicates across disks, and browse inside zip/tar/rar/7z/ISO archives — all from a single SQLite database stored locally.
Requires root. Scanning mounts and unmounts block devices. Run scan commands with sudo, or as root. The database is stored in the invoking user's home directory even under sudo, so search commands work without root.
Requirements
- Linux kernel 3.10+ (uses /sys/block and syscall mount)
- Go 1.24+ (to build)
- ntfs-3g or the kernel ntfs3 module, for NTFS volumes (optional, auto-detected)
No other external tools required. All archive reading, hashing, device probing, and mounting are handled natively.
Build
git clone https://github.com/anomalyco/ebdiskindex
cd ebdiskindex
go build -o ebdiskindex ./cmd/ebdiskindex
Install system-wide:
sudo install -m 755 ebdiskindex /usr/local/bin/
Or run directly from the repo:
./ebdiskindex help
Database location
The database lives at ~/.local/share/ebdiskindex/index.db.
When running under sudo, the real user's home is resolved via $SUDO_USER, so the database is shared between privileged scan invocations and unprivileged search/dedup invocations. Override the directory with EBDISKINDEX_DB=/path/to/dir.
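The lookup order described above could be sketched as follows. This is a minimal illustration, not the project's actual code: the function names `resolveDBDir` and `homeOf` are hypothetical, and the precedence (override first, then `$SUDO_USER`, then `$HOME`) is an assumption based on the paragraph above.

```go
package main

import (
	"fmt"
	"os"
	"os/user"
	"path/filepath"
)

// resolveDBDir returns the directory that would hold index.db.
// getenv and lookupHome are injected so the logic can be exercised without
// touching the real environment. Both names are illustrative.
func resolveDBDir(getenv func(string) string, lookupHome func(username string) (string, error)) (string, error) {
	// An explicit override wins over everything else (assumed precedence).
	if dir := getenv("EBDISKINDEX_DB"); dir != "" {
		return dir, nil
	}
	// Under sudo, $SUDO_USER names the invoking user; use that user's home.
	if name := getenv("SUDO_USER"); name != "" {
		home, err := lookupHome(name)
		if err != nil {
			return "", fmt.Errorf("resolve home of %q: %w", name, err)
		}
		return filepath.Join(home, ".local", "share", "ebdiskindex"), nil
	}
	// Otherwise fall back to the current user's $HOME.
	home := getenv("HOME")
	if home == "" {
		return "", fmt.Errorf("HOME is not set")
	}
	return filepath.Join(home, ".local", "share", "ebdiskindex"), nil
}

// homeOf looks a user up in the system user database.
func homeOf(username string) (string, error) {
	u, err := user.Lookup(username)
	if err != nil {
		return "", err
	}
	return u.HomeDir, nil
}

func main() {
	dir, err := resolveDBDir(os.Getenv, homeOf)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(filepath.Join(dir, "index.db"))
}
```

Resolving the home directory from `$SUDO_USER` rather than `$HOME` matters because `sudo` may reset `$HOME` to root's home, which would otherwise split the index across two databases.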
Quick start
# 1. Inspect the device — prints a ready-to-run disk add command
sudo ebdiskindex disk suggest --device /dev/sdb1
# Output: ebdiskindex disk add --name "MY_DISK" --description "931.5G ntfs" --device "/dev/sdb1"
# 2. Register it (copy the output from step 1 and run it)
ebdiskindex disk add --name "MY_DISK" --description "931.5G ntfs" --device /dev/sdb1
# 3. Fast scan — mounts read-only, walks filesystem, indexes archives
sudo ebdiskindex scan fast --device /dev/sdb1
# 4. Unplug the drive — search works fully offline
ebdiskindex search "vacation photos"
# 5. Deep scan — hashes every file for duplicate detection
sudo ebdiskindex scan deep --device /dev/sdb1
# 6. Find duplicates
ebdiskindex dedup --min-size 1048576
Commands
disk
ebdiskindex disk suggest --device /dev/sdXN
ebdiskindex disk add --name <name> [--description <desc>] [--device /dev/sdXN]
ebdiskindex disk list
ebdiskindex disk edit --id <id> [--name <name>] [--description <desc>]
disk suggest inspects a device and prints a ready-to-run disk add command with the detected label, filesystem type, and size — no flags to look up manually:
$ sudo ebdiskindex disk suggest --device /dev/sdb1
ebdiskindex disk add --name "GIBSONBACKUPS" --description "931.5G ntfs" --device "/dev/sdb1"
disk suggest needs root only to read the block device superblock. disk add/list/edit do not need root.
partition
ebdiskindex partition add --disk-id <id> --device /dev/sdXN
Add additional partitions to an existing disk record. Useful for multi-partition drives.
scan
Requires sudo / root. The scan commands mount the device read-only, walk the filesystem, and unmount when done.
sudo ebdiskindex scan fast --device /dev/sdXN [--partition-id <id>] [--keep-mount]
sudo ebdiskindex scan deep --device /dev/sdXN [--partition-id <id>] [--keep-mount]
- fast: mounts the device read-only, walks the directory tree, records filenames/paths/sizes/timestamps, and lists members of archives (zip, tar, rar, 7z, ISO, disk images). Wipes and rebuilds from scratch on each run.
- deep: runs a fast scan first (if needed), then hashes every file with xxhash. Required for dedup. Uses a worker pool: 1 worker for rotational disks, 4 for SSDs. Shows a per-file progress bar during hashing.
- --keep-mount: leave the device mounted after the scan completes (useful when doing a fast scan immediately followed by a deep scan).
- --partition-id: required when multiple partitions for the same device path are registered. The error message will list the available partition IDs.
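The worker-pool sizing for deep scans (1 worker for rotational disks, 4 for SSDs) could be derived from sysfs roughly like this. It is a sketch under assumptions: the helper names `workersFor` and `hashWorkers` are hypothetical, and reading `/sys/block/<disk>/queue/rotational` is one common way to tell the two device classes apart, not necessarily what ebdiskindex does internally.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// workersFor maps the contents of a "rotational" sysfs flag to a worker count.
// The kernel writes "1" for spinning disks and "0" for non-rotational devices.
func workersFor(rotational string) int {
	if strings.TrimSpace(rotational) == "0" {
		return 4 // SSD: parallel reads are cheap
	}
	return 1 // rotational or unknown: avoid seek thrashing
}

// hashWorkers reads the flag for a whole-disk name such as "sdb"
// (not a partition name such as "sdb1").
func hashWorkers(sysBlock, disk string) int {
	data, err := os.ReadFile(filepath.Join(sysBlock, disk, "queue", "rotational"))
	if err != nil {
		return 1 // flag unreadable: assume the conservative choice
	}
	return workersFor(string(data))
}

func main() {
	// "sda" is only an example device name.
	fmt.Println(hashWorkers("/sys/block", "sda"))
}
```

A single worker on rotational media keeps reads sequential; several concurrent readers would force the head to seek between files and usually lower throughput.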
A live TUI shows scan progress:
Scanning 12,453 / 84,201 (14%) 148/s ETA 483s /Music/Albums/...
████████░░░░░░░░░░░░░░░░ 3.2M / 48.1M
search
ebdiskindex search [query]
ebdiskindex search --query <fts5-expr> [--disk <name>] [--type files|archive-member] [--limit <n>]
- Without arguments: opens an interactive search UI (type to filter, arrow keys to navigate).
- --query: FTS5 full-text search over filenames and paths. Supports FTS5 syntax ("exact phrase", term1 OR term2).
- --type archive-member: search inside indexed archives instead of top-level files.
- --disk: scope results to one registered disk by name.
- --limit: maximum number of results (default 200).
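When building a --query expression from arbitrary user input, the text has to be quoted so that FTS5 treats it as a phrase rather than as operators. The sketch below shows the FTS5 string-literal rule (wrap in double quotes, double any embedded double quote); the helper names `ftsPhrase` and `ftsAny` are hypothetical and not part of ebdiskindex.

```go
package main

import (
	"fmt"
	"strings"
)

// ftsPhrase turns raw text into a single FTS5 string literal.
// Embedded double quotes are escaped SQL-style by doubling them.
func ftsPhrase(s string) string {
	return `"` + strings.ReplaceAll(s, `"`, `""`) + `"`
}

// ftsAny builds an expression that matches any of the given phrases.
func ftsAny(terms ...string) string {
	quoted := make([]string, len(terms))
	for i, t := range terms {
		quoted[i] = ftsPhrase(t)
	}
	return strings.Join(quoted, " OR ")
}

func main() {
	fmt.Println(ftsPhrase("vacation photos"))
	// prints: "vacation photos"
	fmt.Println(ftsAny("archive-2023", `my "best" mix`))
	// prints: "archive-2023" OR "my ""best"" mix"
}
```

Quoting also keeps characters such as the hyphen in archive-2023 from being parsed as FTS5 syntax.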
Search never requires root — the database is readable by the regular user.
dedup
ebdiskindex dedup [--disk <name>] [--min-size <bytes>] [--out <file>]
Find duplicate files by xxhash. Groups all files sharing the same hash, shows the wasted space, and lists every location the file appears across all registered disks. Requires a prior scan deep.
$ ebdiskindex dedup --min-size 1048576
Hash: a3f1c2d4e5b6a7b8 Size: 2.1 GB Copies: 3 Wasted: 4.2 GB
/dev/sdb1 /Backups/archive-2023.zip
/dev/sdc1 /Mirror/archive-2023.zip
/dev/sdd1 /Old/archive-2023.zip
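The grouping and the "Wasted" figure can be illustrated with a small sketch. It assumes a hypothetical `File` record holding the 64-bit hash from a deep scan, and it computes wasted space as size × (copies − 1), which matches the sample above (3 copies of 2.1 GB waste 4.2 GB). The type and function names are illustrative only.

```go
package main

import (
	"fmt"
	"sort"
)

// File is a hypothetical view of one indexed file after a deep scan.
type File struct {
	Device string
	Path   string
	Size   int64
	Hash   uint64 // 64-bit content hash recorded by the deep scan
}

// Group is one set of files with identical hash and size.
type Group struct {
	Hash  uint64
	Size  int64
	Files []File
}

// Wasted is the space reclaimed if all but one copy were removed.
func (g Group) Wasted() int64 { return g.Size * int64(len(g.Files)-1) }

// duplicates groups files by (hash, size), drops files below minSize and
// groups with a single member, and sorts by wasted space, largest first.
func duplicates(files []File, minSize int64) []Group {
	type key struct {
		hash uint64
		size int64
	}
	byKey := make(map[key][]File)
	for _, f := range files {
		if f.Size < minSize {
			continue
		}
		k := key{f.Hash, f.Size}
		byKey[k] = append(byKey[k], f)
	}
	var groups []Group
	for k, fs := range byKey {
		if len(fs) > 1 {
			groups = append(groups, Group{Hash: k.hash, Size: k.size, Files: fs})
		}
	}
	sort.Slice(groups, func(i, j int) bool { return groups[i].Wasted() > groups[j].Wasted() })
	return groups
}

func main() {
	files := []File{
		{"/dev/sdb1", "/Backups/archive-2023.zip", 2 << 30, 0xa3f1c2d4e5b6a7b8},
		{"/dev/sdc1", "/Mirror/archive-2023.zip", 2 << 30, 0xa3f1c2d4e5b6a7b8},
		{"/dev/sdd1", "/Old/notes.txt", 512, 0x1111111111111111},
	}
	for _, g := range duplicates(files, 1048576) {
		fmt.Printf("Hash: %016x  Size: %d  Copies: %d  Wasted: %d\n",
			g.Hash, g.Size, len(g.Files), g.Wasted())
		for _, f := range g.Files {
			fmt.Printf("  %s  %s\n", f.Device, f.Path)
		}
	}
}
```

Keying on size as well as hash is a cheap guard against treating a hash collision between files of different lengths as a duplicate.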
info
ebdiskindex info [--disk <name>] [--partition-id <id>]
Show file counts, total size, archive member counts, and scan timestamps per partition.
Archive support
The following archive formats are read natively without any external tools:
| Format | Extension(s) | Notes |
|---|---|---|
| ZIP | .zip | stdlib |
| TAR | .tar, .tar.gz, .tgz, .tar.bz2, .tbz2, .tar.xz, .txz | stdlib + xz |
| RAR | .rar | nwaples/rardecode |
| 7-Zip | .7z | bodgit/sevenzip |
| ISO 9660 | .iso | internal reader; records ExtentLBA for direct hashing |
| FAT disk image | .img | go-diskfs; FAT12/16/32 |
| MBR disk image | .img | walks each FAT partition in the MBR table |
| VDI container | .vdi | extracts data region; lists FAT contents |
| VHD (fixed) | .vhd | fixed-size only; dynamic VHD returns stub |
| QCOW2 | .qcow2 | informational stub only |
| HFE floppy | .hfe | MFM track decode → FAT extraction |
Filesystem support
Detection and mounting are done natively, via superblock magic bytes and unix.Mount:
| Filesystem | Support |
|---|---|
| ext2/3/4 | native kernel |
| FAT12/16/32, exFAT | native kernel |
| NTFS | requires ntfs3 kernel module (Linux 5.15+) or ntfs-3g |
| Btrfs | native kernel |
| XFS | native kernel |
| HFS+ | native kernel module |
| ISO 9660 | native kernel (loop mount) |
| UDF | native kernel |
| F2FS | native kernel |
| squashfs | native kernel |
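Detection by superblock magic bytes can be sketched as a table of (offset, signature) pairs checked against the first bytes of the device. The offsets and signatures below are the well-known on-disk values for a subset of the filesystems in the table; the `detectFS` function and the table layout are hypothetical and do not reflect the project's internal code.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// magic describes a byte signature at a fixed offset from the device start.
type magic struct {
	fs     string
	offset int
	sig    []byte
}

// A subset of common signatures; FAT, HFS+, UDF and F2FS are omitted here.
var magics = []magic{
	{"xfs", 0, []byte("XFSB")},
	{"squashfs", 0, []byte("hsqs")},
	{"ntfs", 3, []byte("NTFS    ")},
	{"exfat", 3, []byte("EXFAT   ")},
	{"iso9660", 0x8001, []byte("CD001")},
	{"btrfs", 0x10040, []byte("_BHRfS_M")},
}

// detectFS inspects the leading bytes of a device or image and returns a
// filesystem name, or "" if nothing in the table matches.
func detectFS(img []byte) string {
	for _, m := range magics {
		end := m.offset + len(m.sig)
		if len(img) >= end && bytes.Equal(img[m.offset:end], m.sig) {
			return m.fs
		}
	}
	// ext2/3/4 store the 16-bit little-endian value 0xEF53 at offset 0x438
	// (superblock at 1024 bytes, magic field 0x38 bytes into it).
	if len(img) >= 0x43A && binary.LittleEndian.Uint16(img[0x438:]) == 0xEF53 {
		return "ext"
	}
	return ""
}

func main() {
	// Fake 2 KiB image carrying only the ext magic, for demonstration.
	img := make([]byte, 2048)
	binary.LittleEndian.PutUint16(img[0x438:], 0xEF53)
	fmt.Println(detectFS(img)) // prints: ext
}
```

Reading roughly the first 68 KiB of a block device is enough to cover every offset in this table, which is why probing needs read access to the device but no mount.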
Notes
- All scanning is strictly read-only (MS_RDONLY).
- Paths in the database are relative to the device UUID, not the temporary mountpoint, so they remain stable across sessions even if the mountpoint changes.
- FTS5 triggers are disabled during scanning; the index is rebuilt in a single pass at the end.
- Archive members inside files ≥ 1 GB are skipped during fast scan to avoid blocking on large containers.
- The database is a plain SQLite file — you can query it directly with any SQLite client.
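The note about mountpoint-independent paths could be implemented along these lines. This is only a sketch: the `storedPath` helper is hypothetical, and the exact stored form (a leading slash followed by the path relative to the filesystem root) is an assumption based on the paths shown in the dedup example.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// storedPath converts an absolute path under a temporary mountpoint into a
// form that does not depend on where the device happened to be mounted.
func storedPath(mountpoint, abs string) (string, error) {
	rel, err := filepath.Rel(mountpoint, abs)
	if err != nil {
		return "", err
	}
	// Reject anything that escapes the mountpoint.
	if rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", fmt.Errorf("%s is outside %s", abs, mountpoint)
	}
	if rel == "." {
		return "/", nil // the filesystem root itself
	}
	return "/" + filepath.ToSlash(rel), nil
}

func main() {
	// The same file seen through two different (example) mountpoints.
	a, _ := storedPath("/tmp/ebdiskindex-1", "/tmp/ebdiskindex-1/Music/Albums/a.flac")
	b, _ := storedPath("/mnt/usb", "/mnt/usb/Music/Albums/a.flac")
	fmt.Println(a == b, a)
}
```

Pairing such a path with the device UUID gives a key that stays valid the next time the drive is plugged in, whatever mountpoint it receives.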
Old bash version
The original bash implementation is preserved in README.bash.md and ebdiskindex.bash.