CLI Reference Overview

The wicket CLI extracts plain text from Wikipedia XML dump files.

Usage

wicket [OPTIONS] <INPUT>

Quick Reference

Option          Description                                         Default
<INPUT>         Input Wikipedia XML dump file (.xml or .xml.bz2)    (required)
-o, --output    Output directory, or - for stdout                   text
-b, --bytes     Maximum bytes per output file (e.g., 1M, 500K, 1G)  1M
-c, --compress  Compress output files using bzip2                   false
--json          Write output in JSON format                         false
--processes     Number of parallel workers                          CPU count
-q, --quiet     Suppress progress output on stderr                  false
--namespaces    Comma-separated namespace IDs to extract            0
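
When wicket is driven from a script, it can help to assemble the command line in one place. The following is a minimal sketch, not part of wicket itself: the helper name build_wicket_command, its parameter names, and the file name dump.xml.bz2 are hypothetical, and it assumes that value-taking flags accept their value as a separate argument (for example --bytes 500K). Only the flags listed in the table above are used.

```python
import shlex
from typing import List, Optional, Sequence


def build_wicket_command(
    input_path: str,
    output: Optional[str] = None,
    max_bytes: Optional[str] = None,
    compress: bool = False,
    json_output: bool = False,
    processes: Optional[int] = None,
    quiet: bool = False,
    namespaces: Optional[Sequence[int]] = None,
) -> List[str]:
    """Build an argv list for wicket (hypothetical helper).

    Options left at None/False are omitted, so wicket applies its own
    defaults from the Quick Reference table.
    """
    cmd = ["wicket"]
    if output is not None:
        cmd += ["--output", output]  # a directory, or "-" for stdout
    if max_bytes is not None:
        cmd += ["--bytes", max_bytes]  # size string such as "1M", "500K", "1G"
    if compress:
        cmd.append("--compress")
    if json_output:
        cmd.append("--json")
    if processes is not None:
        cmd += ["--processes", str(processes)]
    if quiet:
        cmd.append("--quiet")
    if namespaces:
        # Assumption: IDs are joined with commas and no spaces.
        cmd += ["--namespaces", ",".join(str(n) for n in namespaces)]
    cmd.append(input_path)  # <INPUT> goes last, matching the usage line
    return cmd


if __name__ == "__main__":
    cmd = build_wicket_command(
        "dump.xml.bz2",  # hypothetical input file
        output="extracted",
        max_bytes="500K",
        json_output=True,
        namespaces=[0, 14],
    )
    # Print the command in shell-quoted form for inspection.
    print(shlex.join(cmd))
```

The returned list can be passed directly to subprocess.run(cmd, check=True); building a list rather than a single string avoids shell quoting problems with unusual file names.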

Detailed Documentation

  • Options – detailed description of all CLI options
  • Examples – common usage patterns and examples