Workspace Structure

wicket is organized as a Cargo workspace with two crates and supporting directories.

Directory Layout

wicket/
├── Cargo.toml              # Workspace manifest
├── Cargo.lock              # Dependency lock file
├── LICENSE                 # MIT OR Apache-2.0
├── README.md               # Project overview
├── wicket/                 # Core library crate
│   ├── Cargo.toml
│   └── src/
│       ├── lib.rs          # Module declarations and re-exports
│       ├── dump.rs         # XML dump streaming parser
│       ├── cleaner.rs      # Wikitext to plain text conversion
│       ├── extractor.rs    # Output formatting (doc/JSON)
│       ├── output.rs       # File splitting and rotation
│       └── error.rs        # Error types
├── wicket-cli/             # CLI binary crate
│   ├── Cargo.toml
│   └── src/
│       └── main.rs         # CLI entry point
├── docs/                   # mdBook documentation (this book)
│   ├── book.toml
│   ├── src/
│   └── ja/                 # Japanese documentation
│       ├── book.toml
│       └── src/
└── .github/
    └── workflows/          # CI/CD pipelines
        ├── regression.yml  # Test on push/PR
        ├── release.yml     # Release builds and publishing
        ├── periodic.yml    # Weekly stability tests
        └── deploy-docs.yml # Documentation deployment

Crate Details

wicket (Core Library)

The core library provides streaming XML parsing, wikitext cleaning, output formatting, and file splitting.

Dependency         Version  Purpose
quick-xml          0.39     Streaming XML parsing
parse-wiki-text-2  0.2      Wikitext AST parsing
regex              1.12     Fallback wikitext cleaning
bzip2              0.6      Bzip2 compression/decompression
serde              1.0      Serialization framework
serde_json         1.0      JSON output formatting
rayon              1.11     Data parallelism (used by CLI)
thiserror          2.0      Error type derivation
log                0.4      Logging facade

wicket-cli (CLI Binary)

The CLI provides a command-line interface to wicket’s functionality.

Dependency   Version  Purpose
clap         4.5      Command-line argument parsing
rayon        1.11     Parallel batch processing
bzip2        0.6      Compressed output support
env_logger   0.11     Logging output
anyhow       1.0      Error handling in binary
wicket       0.1      Core library (workspace member)

Workspace Configuration

The workspace uses Cargo's dependency resolver version 3, which is the default for Rust Edition 2024:

[workspace]
resolver = "3"
members = ["wicket", "wicket-cli"]

[workspace.package]
version = "0.1.0"
edition = "2024"
license = "MIT OR Apache-2.0"

Shared dependencies are defined at the workspace level in [workspace.dependencies] and referenced by each crate with { workspace = true }.
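
As an illustration of this inheritance pattern, the workspace manifest might declare shared dependencies as in the sketch below. The selection of entries is an assumption based on the versions listed under Crate Details; the actual manifest may list more crates or enable features.

```toml
# Root Cargo.toml (illustrative excerpt, not the actual file)
[workspace.dependencies]
quick-xml = "0.39"
serde_json = "1.0"
rayon = "1.11"
```

A member crate can then inherit those versions, and the fields from [workspace.package], without repeating them. Whether wicket's member manifests inherit the package fields in exactly this way is likewise an assumption.

```toml
# wicket/Cargo.toml (illustrative excerpt, not the actual file)
[package]
name = "wicket"
version.workspace = true   # inherits "0.1.0" from [workspace.package]
edition.workspace = true   # inherits "2024"
license.workspace = true   # inherits "MIT OR Apache-2.0"

[dependencies]
quick-xml = { workspace = true }   # version comes from [workspace.dependencies]
serde_json = { workspace = true }
rayon = { workspace = true }
```

Keeping version numbers in a single place means a dependency upgrade touches only the root manifest, and both crates always build against the same version.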