Your first pipeline

This walkthrough converts a local CSV file to JSON Lines. It needs no external services, so it works as soon as cargo install faucet-cli has finished.

1. Create some input

mkdir -p data out
cat > data/input.csv <<'CSV'
id,name,city
1,Ada,London
2,Grace,New York
3,Linus,Helsinki
CSV

2. Write a config

Create pipeline.yaml:

version: 1
name: csv_to_jsonl

pipeline:
  source:
    type: csv
    config:
      path: ./data/input.csv
  sink:
    type: jsonl
    config:
      path: ./out/records.jsonl

faucet run auto-discovers a faucet.yaml / faucet.yml / faucet.json in the current directory (and a sibling .env), so you can also name the file faucet.yaml and just run faucet run.
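The lookup convention can be pictured with a small Python sketch. This is not faucet's code: the function name discover_config is hypothetical, and trying the three file names in the order they are listed is an assumption, since the text does not say which one wins when several exist.

```python
from pathlib import Path

# File names taken from the paragraph above; the precedence order is assumed.
CANDIDATES = ("faucet.yaml", "faucet.yml", "faucet.json")


def discover_config(directory="."):
    """Return the first candidate config file found in directory, or None."""
    base = Path(directory)
    for name in CANDIDATES:
        candidate = base / name
        if candidate.is_file():
            return candidate
    return None


print(discover_config() or "no faucet config in the current directory")
```

If the sketch finds nothing, the equivalent situation with the real CLI is that you pass the config path explicitly, as the next step does with pipeline.yaml.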

3. Validate, then run

faucet validate pipeline.yaml
faucet run pipeline.yaml

The sink file now holds one JSON object per input row:

$ cat out/records.jsonl
{"id":"1","name":"Ada","city":"London"}
{"id":"2","name":"Grace","city":"New York"}
{"id":"3","name":"Linus","city":"Helsinki"}
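Note that every value in the output above is a JSON string, including id. The Python sketch below is not how faucet is implemented; it only reproduces the same mapping with the standard library, where the header row supplies the keys and each cell is read as text. The names SAMPLE_CSV and csv_to_jsonl_lines are hypothetical.

```python
import csv
import io
import json

# Same rows as data/input.csv from step 1.
SAMPLE_CSV = """id,name,city
1,Ada,London
2,Grace,New York
3,Linus,Helsinki
"""


def csv_to_jsonl_lines(csv_text):
    """Return one compact JSON object per CSV row, keyed by the header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # The csv module yields every cell as str, so "1" stays a string.
    return [json.dumps(row, separators=(",", ":")) for row in reader]


for line in csv_to_jsonl_lines(SAMPLE_CSV):
    print(line)
```

Running the sketch prints the same three lines as out/records.jsonl, which makes it a convenient reference when checking what a pipeline should produce.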

4. Preview without writing

To see what a source emits without touching a sink, use preview, which runs the source and prints records to stdout:

faucet preview pipeline.yaml --limit 5

5. Scaffold from a connector’s schema

faucet init generates a commented config skeleton from any connector’s JSON schema, marking required fields and commenting out optional ones:

faucet init my_pipeline --source rest --sink postgres

Add a transform

Insert a transforms: list between source and sink to reshape records. For example, normalize keys to snake_case:

pipeline:
  source: { type: csv, config: { path: ./data/input.csv } }
  transforms:
    - type: snake_case
  sink: { type: jsonl, config: { path: ./out/records.jsonl } }

Built-in config transforms are flatten, rename_keys, and snake_case.
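To give a feel for what key normalization does to a record, here is a minimal Python sketch of one common snake_case convention. It is an illustration under assumptions, not faucet's transform: the exact splitting rules faucet applies are not stated here, and the helper names to_snake_case and snake_case_keys are hypothetical.

```python
import re


def to_snake_case(key):
    """Convert a key such as 'firstName' or 'Home City' to snake_case."""
    # Insert an underscore where a lowercase letter or digit meets an uppercase letter.
    key = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", key)
    # Collapse spaces, hyphens, and other non-alphanumeric runs into one underscore.
    key = re.sub(r"[^A-Za-z0-9]+", "_", key)
    return key.strip("_").lower()


def snake_case_keys(record):
    """Return a copy of record with every top-level key normalized."""
    return {to_snake_case(k): v for k, v in record.items()}


print(snake_case_keys({"firstName": "Ada", "Home City": "London", "id": "1"}))
```

Under this convention the sample keys id, name, and city are already normalized, so the transform would only show a visible effect on input whose headers use mixed case or spaces.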

Next: core concepts.