ebdiskindex

Index files on external drives without keeping them plugged in. Search by filename, find duplicates across disks, and browse inside zip/tar/rar/7z/ISO archives — all from a single SQLite database stored locally.

Requires root. Scanning mounts and unmounts block devices. Run scan commands with sudo, or as root. The database is stored in the invoking user's home directory even under sudo, so search commands work without root.

Requirements

  • Linux kernel 3.10+ (uses /sys/block and syscall mount)
  • Go 1.24+ (to build)
  • ntfs-3g or kernel ntfs3 module — for NTFS volumes (optional, auto-detected)

No other external tools required. All archive reading, hashing, device probing, and mounting are handled natively.

Build

git clone https://github.com/anomalyco/ebdiskindex
cd ebdiskindex
go build -o ebdiskindex ./cmd/ebdiskindex

Install system-wide:

sudo install -m 755 ebdiskindex /usr/local/bin/

Or run directly from the repo:

./ebdiskindex help

Database location

The database lives at ~/.local/share/ebdiskindex/index.db.

When running under sudo, the real user's home is resolved via $SUDO_USER, so the database is shared between privileged scan invocations and unprivileged search/dedup invocations. Override the directory with EBDISKINDEX_DB=/path/to/dir.
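The lookup order described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual code: the function names `dbDir` and `lookupHome` are hypothetical, and the precedence of `EBDISKINDEX_DB` over `$SUDO_USER` is assumed from the wording of this section.

```go
package main

import (
	"fmt"
	"os"
	"os/user"
	"path/filepath"
)

// dbDir returns the directory expected to hold index.db.
// getenv and homeOf are injected so the logic can be exercised without sudo.
// Hypothetical helper; the real resolution logic may differ.
func dbDir(getenv func(string) string, homeOf func(name string) (string, error)) (string, error) {
	// Assumption: an explicit override wins over everything else.
	if dir := getenv("EBDISKINDEX_DB"); dir != "" {
		return dir, nil
	}
	// Under sudo, HOME may point at root's home, so prefer the invoking user.
	if name := getenv("SUDO_USER"); name != "" {
		home, err := homeOf(name)
		if err != nil {
			return "", fmt.Errorf("resolve home of %q: %w", name, err)
		}
		return filepath.Join(home, ".local", "share", "ebdiskindex"), nil
	}
	home := getenv("HOME")
	if home == "" {
		return "", fmt.Errorf("HOME is not set")
	}
	return filepath.Join(home, ".local", "share", "ebdiskindex"), nil
}

// lookupHome resolves a username to its home directory via os/user.
func lookupHome(name string) (string, error) {
	u, err := user.Lookup(name)
	if err != nil {
		return "", err
	}
	return u.HomeDir, nil
}

func main() {
	dir, err := dbDir(os.Getenv, lookupHome)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(filepath.Join(dir, "index.db"))
}
```

Injecting the environment and home lookup keeps the resolution logic testable as a regular user.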

Quick start

# 1. Inspect the device — prints a ready-to-run disk add command
sudo ebdiskindex disk suggest --device /dev/sdb1
# Output: ebdiskindex disk add --name "MY_DISK" --description "931.5G ntfs" --device "/dev/sdb1"

# 2. Register it (copy the output from step 1 and run it)
ebdiskindex disk add --name "MY_DISK" --description "931.5G ntfs" --device /dev/sdb1

# 3. Fast scan — mounts read-only, walks filesystem, indexes archives
sudo ebdiskindex scan fast --device /dev/sdb1

# 4. Unplug the drive — search works fully offline
ebdiskindex search "vacation photos"

# 5. Deep scan — hashes every file for duplicate detection
sudo ebdiskindex scan deep --device /dev/sdb1

# 6. Find duplicates
ebdiskindex dedup --min-size 1048576

Commands

disk

ebdiskindex disk suggest --device /dev/sdXN
ebdiskindex disk add  --name <name> [--description <desc>] [--device /dev/sdXN]
ebdiskindex disk list
ebdiskindex disk edit --id <id> [--name <name>] [--description <desc>]

disk suggest inspects a device and prints a ready-to-run disk add command with the detected label, filesystem type, and size — no flags to look up manually:

$ sudo ebdiskindex disk suggest --device /dev/sdb1
ebdiskindex disk add --name "GIBSONBACKUPS" --description "931.5G ntfs" --device "/dev/sdb1"

disk suggest needs root only to read the block device superblock. disk add/list/edit do not need root.

partition

ebdiskindex partition add --disk-id <id> --device /dev/sdXN

Add additional partitions to an existing disk record. Useful for multi-partition drives.

scan

Requires sudo / root. The scan commands mount the device read-only, walk the filesystem, and unmount when done.

sudo ebdiskindex scan fast  --device /dev/sdXN [--partition-id <id>] [--keep-mount]
sudo ebdiskindex scan deep  --device /dev/sdXN [--partition-id <id>] [--keep-mount]
  • fast — mounts the device read-only, walks the directory tree, records filenames/paths/sizes/timestamps, and lists the members of archives (zip, tar, rar, 7z, ISO, disk images). Wipes and rebuilds from scratch on each run.
  • deep — runs a fast scan first (if needed), then hashes every file with xxhash. Required for dedup. Uses a worker pool: 1 worker for rotational disks, 4 for SSDs. Shows a per-file progress bar during hashing.
  • --keep-mount — leave the device mounted after the scan completes (useful when doing a fast scan immediately followed by a deep scan).
  • --partition-id — required when multiple partitions for the same device path are registered. The error message will list the available partition IDs.
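The worker-pool sizing mentioned for deep scans could be derived from the kernel's rotational flag in sysfs. The sketch below is only a guess at the approach: the function names are hypothetical, and falling back to a single worker when the flag cannot be read is an assumption, not documented behavior.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// workersFor maps the contents of /sys/block/<disk>/queue/rotational
// to a hashing worker count: 1 for rotational disks, 4 otherwise.
func workersFor(rotational string) int {
	if strings.TrimSpace(rotational) == "1" {
		return 1
	}
	return 4
}

// hashWorkers reads the sysfs flag for a whole-disk name such as "sdb".
// Assumption: an unreadable flag falls back to the conservative single worker.
func hashWorkers(sysBlock, disk string) int {
	data, err := os.ReadFile(filepath.Join(sysBlock, disk, "queue", "rotational"))
	if err != nil {
		return 1
	}
	return workersFor(string(data))
}

func main() {
	// "sdb" is a placeholder; pass the parent disk of the partition being scanned.
	fmt.Println(hashWorkers("/sys/block", "sdb"))
}
```

A single worker on rotational media avoids seek thrashing, while SSDs benefit from concurrent reads.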

A live TUI shows scan progress:

  Scanning  12,453 / 84,201  (14%)  148/s  ETA 483s  /Music/Albums/...
  ████████░░░░░░░░░░░░░░░░  3.2M / 48.1M

search

ebdiskindex search [query]
ebdiskindex search --query <fts5-expr> [--disk <name>] [--type files|archive-member] [--limit <n>]
  • Without arguments: opens an interactive search UI (type to filter, arrow keys to navigate).
  • --query: FTS5 full-text search over filenames and paths. Supports FTS5 syntax ("exact phrase", term1 OR term2).
  • --type archive-member: search inside indexed archives instead of top-level files.
  • --disk: scope results to one registered disk by name.
  • --limit: maximum number of results (default 200).
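When building a --query value programmatically, user input containing spaces or quotes has to be turned into a valid FTS5 string. A minimal sketch, assuming the value is passed to FTS5 unchanged; `ftsPhrase` and `ftsAny` are hypothetical helpers, not part of ebdiskindex:

```go
package main

import (
	"fmt"
	"strings"
)

// ftsPhrase wraps s in double quotes so FTS5 treats it as one phrase.
// Embedded double quotes are escaped by doubling them, as in SQL.
func ftsPhrase(s string) string {
	return `"` + strings.ReplaceAll(s, `"`, `""`) + `"`
}

// ftsAny quotes each term and joins them with the FTS5 OR operator.
func ftsAny(terms ...string) string {
	quoted := make([]string, len(terms))
	for i, t := range terms {
		quoted[i] = ftsPhrase(t)
	}
	return strings.Join(quoted, " OR ")
}

func main() {
	fmt.Println(ftsPhrase("vacation photos")) // "vacation photos"
	fmt.Println(ftsAny("invoice", "receipt")) // "invoice" OR "receipt"
}
```

Quoting every term also keeps words such as AND, OR, and NOT from being interpreted as operators.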

Search never requires root — the database is readable by the regular user.

dedup

ebdiskindex dedup [--disk <name>] [--min-size <bytes>] [--out <file>]

Find duplicate files by xxhash. Groups all files sharing the same hash, shows the wasted space, and lists every location the file appears across all registered disks. Requires a prior scan deep.

$ ebdiskindex dedup --min-size 1048576
Hash: a3f1c2d4e5b6a7b8  Size: 2.1 GB  Copies: 3  Wasted: 4.2 GB
  /dev/sdb1  /Backups/archive-2023.zip
  /dev/sdc1  /Mirror/archive-2023.zip
  /dev/sdd1  /Old/archive-2023.zip
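The grouping and the "Wasted" figure in this output can be illustrated with a short sketch. The types and function names here are hypothetical and the real implementation presumably queries the database instead. Wasted space is taken to be the file size multiplied by the number of redundant copies, which matches the example (3 copies of 2.1 GB waste 4.2 GB).

```go
package main

import (
	"fmt"
	"sort"
)

// File is a hypothetical record of one hashed file.
type File struct {
	Hash   string
	Size   int64
	Device string
	Path   string
}

// Group collects every file sharing one hash.
type Group struct {
	Hash  string
	Size  int64
	Files []File
}

// Wasted is the space used by all copies beyond the first.
func (g Group) Wasted() int64 { return g.Size * int64(len(g.Files)-1) }

// duplicates groups files by hash, drops files below minSize and groups
// with a single member, and sorts by wasted space, largest first.
func duplicates(files []File, minSize int64) []Group {
	byHash := map[string]*Group{}
	for _, f := range files {
		if f.Size < minSize {
			continue
		}
		g, ok := byHash[f.Hash]
		if !ok {
			g = &Group{Hash: f.Hash, Size: f.Size}
			byHash[f.Hash] = g
		}
		g.Files = append(g.Files, f)
	}
	var out []Group
	for _, g := range byHash {
		if len(g.Files) > 1 {
			out = append(out, *g)
		}
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].Wasted() != out[j].Wasted() {
			return out[i].Wasted() > out[j].Wasted()
		}
		return out[i].Hash < out[j].Hash // stable tie-break
	})
	return out
}

func main() {
	files := []File{
		{"a3f1", 2 << 20, "/dev/sdb1", "/Backups/archive-2023.zip"},
		{"a3f1", 2 << 20, "/dev/sdc1", "/Mirror/archive-2023.zip"},
		{"77aa", 4 << 20, "/dev/sdb1", "/unique.bin"},
	}
	for _, g := range duplicates(files, 1<<20) {
		fmt.Printf("Hash: %s  Copies: %d  Wasted: %d bytes\n", g.Hash, len(g.Files), g.Wasted())
	}
}
```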

info

ebdiskindex info [--disk <name>] [--partition-id <id>]

Show file counts, total size, archive member counts, and scan timestamps per partition.

Archive support

The following archive formats are read natively without any external tools:

| Format | Extension(s) | Notes |
|---|---|---|
| ZIP | .zip | stdlib |
| TAR | .tar, .tar.gz, .tgz, .tar.bz2, .tbz2, .tar.xz, .txz | stdlib + xz |
| RAR | .rar | nwaples/rardecode |
| 7-Zip | .7z | bodgit/sevenzip |
| ISO 9660 | .iso | internal reader; records ExtentLBA for direct hashing |
| FAT disk image | .img | go-diskfs; FAT12/16/32 |
| MBR disk image | .img | walks each FAT partition in the MBR table |
| VDI container | .vdi | extracts data region; lists FAT contents |
| VHD (fixed) | .vhd | fixed-size only; dynamic VHD returns stub |
| QCOW2 | .qcow2 | informational stub only |
| HFE floppy | .hfe | MFM track decode → FAT extraction |

Filesystem support

Detection and mounting are done natively via superblock magic bytes and unix.Mount:

| Filesystem | Support |
|---|---|
| ext2/3/4 | native kernel |
| FAT12/16/32, exFAT | native kernel |
| NTFS | requires ntfs3 kernel module (Linux 5.15+) or ntfs-3g |
| Btrfs | native kernel |
| XFS | native kernel |
| HFS+ | native kernel module |
| ISO 9660 | native kernel (loop mount) |
| UDF | native kernel |
| F2FS | native kernel |
| squashfs | native kernel |
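To give a feel for detection by superblock magic bytes, here is a minimal sketch covering a handful of the filesystems listed. The table layout and function names are illustrative assumptions and not the project's detector. The offsets and signatures are the standard on-disk values for each format.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"os"
)

// magic describes where a filesystem's signature sits on the device.
type magic struct {
	name   string
	offset int64
	sig    []byte
}

// A small, non-exhaustive table of well-known signatures.
var magics = []magic{
	{"xfs", 0, []byte("XFSB")},
	{"squashfs", 0, []byte("hsqs")},
	{"ntfs", 3, []byte("NTFS    ")},
	{"ext2/3/4", 1080, []byte{0x53, 0xEF}}, // 0xEF53 little-endian, superblock at 1024 + 56
	{"btrfs", 0x10040, []byte("_BHRfS_M")}, // superblock at 64 KiB + 0x40
}

// detect returns the first filesystem whose signature matches, or "".
func detect(r io.ReaderAt) string {
	for _, m := range magics {
		buf := make([]byte, len(m.sig))
		// A short read means the device is too small for this probe.
		if n, _ := r.ReadAt(buf, m.offset); n != len(buf) {
			continue
		}
		if bytes.Equal(buf, m.sig) {
			return m.name
		}
	}
	return ""
}

// detectBytes runs detect over an in-memory image.
func detectBytes(img []byte) string { return detect(bytes.NewReader(img)) }

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: probe <device-or-image>")
		os.Exit(2)
	}
	f, err := os.Open(os.Args[1]) // reading a block device typically needs root
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer f.Close()
	fmt.Println(detect(f))
}
```

Probing through `io.ReaderAt` lets the same code run against a block device, an image file, or an in-memory buffer.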

Notes

  • All scanning is strictly read-only (MS_RDONLY).
  • Paths in the database are relative to the device UUID, not the temporary mountpoint, so they remain stable across sessions even if the mountpoint changes.
  • FTS5 triggers are disabled during scanning; the index is rebuilt in a single pass at the end.
  • Archive members inside files ≥ 1 GB are skipped during fast scan to avoid blocking on large containers.
  • The database is a plain SQLite file — you can query it directly with any SQLite client.

Embedded components

| Component | Version | License | Notes |
|---|---|---|---|
| SQLite (via modernc.org/sqlite v1.46.1) | 3.47.2 | Public Domain | Pure-Go transpilation; no system libsqlite3 required. CVEs are addressed by updating the modernc.org/sqlite module in go.mod. |

Old bash version

The original bash implementation is preserved in README.bash.md and ebdiskindex.bash.