Workflow Orchestration Integration
The workflow integration provides powerful YAML-based automation for complex repository management tasks, with support for directed acyclic graph (DAG) execution, conditional logic, and parallel processing.
Overview
Workflow orchestration enables you to:
- Automate Complex Tasks: Chain multiple repoindex commands into workflows
- Conditional Execution: Run steps based on conditions and previous results
- Parallel Processing: Execute independent tasks concurrently
- Scheduled Operations: Run workflows on schedules via cron
- Error Handling: Robust retry logic and failure recovery
- Template Reuse: Share workflow templates across projects
Quick Start
Run an Example Workflow
# Run the morning routine workflow
repoindex workflow run examples/workflows/morning-routine.yaml
# Run with variables
repoindex workflow run release.yaml --var version=1.2.0 --var branch=main
# Dry run to preview execution
repoindex workflow run complex-workflow.yaml --dry-run
# Run with verbose output
repoindex workflow run workflow.yaml --verbose
Create Your First Workflow
Create a file named my-workflow.yaml:
name: Repository Health Check
description: Check and fix common repository issues
variables:
max_days: 30
fix_issues: false
steps:
- id: find-stale
name: Find stale repositories
action: repoindex.query
parameters:
query: "days_since_commit > {{ max_days }}"
- id: audit-stale
name: Audit stale repositories
action: repoindex.audit
parameters:
check: ["license", "readme", "security"]
fix: "{{ fix_issues }}"
depends_on: [find-stale]
when: "{{ steps.find-stale.output | length > 0 }}"
- id: report
name: Generate report
action: repoindex.export
parameters:
format: markdown
output: health-report.md
depends_on: [audit-stale]
Run it:
repoindex workflow run my-workflow.yaml --var fix_issues=true
Workflow Structure
Basic Structure
name: Workflow Name
description: What this workflow does
version: 1.0.0
# Global configuration
config:
max_parallel: 4
continue_on_error: false
timeout: 3600
# Variables with defaults
variables:
environment: development
deploy: false
# Workflow steps
steps:
- id: step-1
name: First Step
action: repoindex.list
parameters:
pretty: false
- id: step-2
name: Second Step
action: repoindex.status
depends_on: [step-1]
Step Definition
Each step can include:
- id: unique-identifier # Required: unique step ID
name: Human Readable Name # Required: display name
action: repoindex.command # Required: action to execute
parameters: # Optional: action parameters
key: value
depends_on: [step-1, step-2] # Optional: dependencies
when: "condition expression" # Optional: conditional execution
retry: # Optional: retry configuration
attempts: 3
delay: 5
timeout: 300 # Optional: step timeout in seconds
continue_on_error: true # Optional: continue if step fails
Actions
Built-in Actions
Repository Management
- action: repoindex.list # List repositories
- action: repoindex.status # Check repository status
- action: repoindex.update # Update repositories
- action: repoindex.clone # Clone repositories
Analysis Actions
- action: repoindex.audit # Audit repositories
- action: repoindex.cluster # Cluster analysis
- action: repoindex.query # Query repositories
- action: repoindex.stats # Generate statistics
Export Actions
- action: repoindex.export # Export to various formats
- action: repoindex.report # Generate reports
- action: repoindex.publish # Publish to platforms
Utility Actions
- action: shell # Execute shell commands
- action: http # Make HTTP requests
- action: wait # Wait for duration
- action: log # Log messages
- action: notify # Send notifications
Custom Actions
Create custom actions in Python:
# ~/.repoindex/actions/my_action.py
from repoindex.integrations.workflow import Action

class MyCustomAction(Action):
    def execute(self, parameters, context):
        # Your logic here; process_data() is a placeholder for your own code
        result = process_data(parameters['input'])
        return {
            'status': 'success',
            'output': result
        }
Use it in a workflow:
- id: custom
action: custom.my_action
parameters:
input: "{{ steps.previous.output }}"
Variables and Templating
Variable Definition
variables:
# Simple variables
environment: production
max_retries: 3
# Complex variables
config:
server: api.example.com
port: 443
# Lists
repos:
- repo1
- repo2
Variable Usage
steps:
- id: deploy
action: repoindex.deploy
parameters:
env: "{{ environment }}"
server: "{{ config.server }}:{{ config.port }}"
repos: "{{ repos }}"
Context Variables
Access workflow context:
- id: log-context
action: log
parameters:
message: |
Workflow: {{ workflow.name }}
Step: {{ step.id }}
Time: {{ workflow.start_time }}
Previous output: {{ steps.previous.output }}
Conditional Execution
Simple Conditions
- id: deploy
action: repoindex.deploy
when: "{{ environment == 'production' }}"
Complex Conditions
- id: notify
action: notify
when: |
{{
steps.test.status == 'success' and
environment in ['staging', 'production'] and
steps.audit.output.issues | length == 0
}}
Conditional Patterns
# Skip on failure
when: "{{ steps.previous.status == 'success' }}"
# Execute on specific day
when: "{{ workflow.date.weekday == 'Monday' }}"
# Check output size
when: "{{ steps.list.output | length > 10 }}"
# Complex logic
when: "{{ (a > b) or (c == 'd' and e != 'f') }}"
Dependencies and DAG Execution
Linear Dependencies
steps:
- id: step1
action: repoindex.list
- id: step2
action: repoindex.filter
depends_on: [step1]
- id: step3
action: repoindex.export
depends_on: [step2]
Parallel Execution
steps:
# These run in parallel
- id: audit-security
action: repoindex.audit
parameters:
check: security
- id: audit-license
action: repoindex.audit
parameters:
check: license
- id: audit-docs
action: repoindex.audit
parameters:
check: documentation
# This waits for all audits
- id: combine-reports
action: repoindex.combine
depends_on: [audit-security, audit-license, audit-docs]
Complex DAG
steps:
- id: init
action: repoindex.init
- id: fetch-a
action: repoindex.fetch
depends_on: [init]
- id: fetch-b
action: repoindex.fetch
depends_on: [init]
- id: process-a
action: repoindex.process
depends_on: [fetch-a]
- id: process-b
action: repoindex.process
depends_on: [fetch-b]
- id: merge
action: repoindex.merge
depends_on: [process-a, process-b]
- id: validate
action: repoindex.validate
depends_on: [merge]
- id: deploy
action: repoindex.deploy
depends_on: [validate]
when: "{{ steps.validate.status == 'success' }}"
Error Handling
Retry Configuration
steps:
- id: flaky-operation
action: http
parameters:
url: https://api.example.com/data
retry:
attempts: 3
delay: 5 # seconds
backoff: 2 # exponential backoff multiplier
max_delay: 60 # maximum delay between retries
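With these settings the wait between attempts grows geometrically and is capped at max_delay. A small sketch of the resulting delay sequence, assuming the backoff multiplier is applied after each failed attempt:

# Waits between attempts for the retry settings delay, backoff, max_delay
def retry_delays(attempts, delay, backoff, max_delay):
    return [min(delay * backoff ** i, max_delay) for i in range(attempts - 1)]

print(retry_delays(attempts=3, delay=5, backoff=2, max_delay=60))  # [5, 10]
print(retry_delays(attempts=6, delay=5, backoff=2, max_delay=60))  # [5, 10, 20, 40, 60]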
Error Recovery
steps:
- id: main-operation
action: repoindex.deploy
continue_on_error: true
- id: fallback
action: repoindex.rollback
when: "{{ steps.main-operation.status == 'failed' }}"
- id: notify-error
action: notify
parameters:
message: "Deploy failed: {{ steps.main-operation.error }}"
when: "{{ steps.main-operation.status == 'failed' }}"
Global Error Handling
config:
continue_on_error: false # Stop on first error
on_failure:
- action: notify
parameters:
channel: alerts
message: "Workflow {{ workflow.name }} failed"
- action: repoindex.cleanup
Loops and Iteration
For Each Loop
variables:
repos:
- repo1
- repo2
- repo3
steps:
- id: process-repos
name: Process each repository
action: repoindex.foreach
parameters:
items: "{{ repos }}"
action: repoindex.audit
item_name: repo
parameters:
repository: "{{ item }}"
fix: true
While Loop
steps:
- id: wait-for-ready
action: repoindex.while
parameters:
condition: "{{ not ready }}"
max_iterations: 10
delay: 30
action: http
parameters:
url: https://api.example.com/status
Map Operation
steps:
- id: map-repos
action: repoindex.map
parameters:
items: "{{ steps.list.output }}"
expression: |
{
name: item.name,
status: item.status,
needs_update: item.behind > 0
}
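The map expression produces one record per input item. For reference, the equivalent transformation in plain Python looks like this, assuming each item exposes the name, status, and behind fields used in the expression above:

# Plain-Python equivalent of the map expression, on illustrative input
items = [
    {"name": "repo1", "status": "clean", "behind": 0},
    {"name": "repo2", "status": "dirty", "behind": 3},
]
mapped = [
    {"name": i["name"], "status": i["status"], "needs_update": i["behind"] > 0}
    for i in items
]
print(mapped)  # repo2 is flagged because it is 3 commits behind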
Workflow Examples
Morning Routine
name: Morning Repository Routine
description: Daily repository maintenance tasks
steps:
- id: update-all
name: Update all repositories
action: repoindex.update
parameters:
recursive: true
fetch: true
- id: check-status
name: Check repository status
action: repoindex.status
parameters:
recursive: true
depends_on: [update-all]
- id: find-issues
name: Find repositories with issues
action: repoindex.query
parameters:
query: |
status.uncommitted_changes == true or
status.unpushed_commits == true or
days_since_commit > 30
depends_on: [check-status]
- id: generate-report
name: Generate morning report
action: repoindex.export
parameters:
format: markdown
template: morning-report
output: ~/reports/morning-{{ workflow.date }}.md
depends_on: [find-issues]
- id: notify
name: Send notification
action: notify
parameters:
type: email
subject: "Morning Report - {{ workflow.date }}"
body: "Found {{ steps.find-issues.output | length }} repos needing attention"
depends_on: [generate-report]
Release Pipeline
name: Release Pipeline
description: Automated release workflow
variables:
version: "{{ env.VERSION }}"
branch: main
deploy_env: production
steps:
- id: validate-version
name: Validate version number
action: shell
parameters:
command: |
if [[ ! "{{ version }}" =~ ^[0-9]+\.[0-9]+\.[0-9]+$ ]]; then
echo "Invalid version format"
exit 1
fi
- id: run-tests
name: Run test suite
action: shell
parameters:
command: pytest tests/ --cov
retry:
attempts: 2
- id: build
name: Build package
action: shell
parameters:
command: python -m build
depends_on: [run-tests]
- id: audit
name: Security audit
action: repoindex.audit
parameters:
check: [security, dependencies]
fail_on_issues: true
depends_on: [build]
- id: tag-release
name: Create git tag
action: shell
parameters:
command: |
git tag -a v{{ version }} -m "Release v{{ version }}"
git push origin v{{ version }}
depends_on: [audit]
when: "{{ deploy_env == 'production' }}"
- id: publish-pypi
name: Publish to PyPI
action: shell
parameters:
command: twine upload dist/*
depends_on: [tag-release]
when: "{{ deploy_env == 'production' }}"
- id: publish-docs
name: Deploy documentation
action: repoindex.docs
parameters:
command: deploy
platform: github-pages
depends_on: [tag-release]
- id: announce
name: Announce release
action: repoindex.social
parameters:
platforms: [twitter, linkedin]
message: "Released v{{ version }} with new features!"
depends_on: [publish-pypi, publish-docs]
Scheduling Workflows
Using Cron
# Add to crontab
0 9 * * * repoindex workflow run morning-routine.yaml
0 0 * * 0 repoindex workflow run weekly-cleanup.yaml
Using the repoindex Service
{
"service": {
"workflows": [
{
"path": "morning-routine.yaml",
"schedule": "0 9 * * *"
},
{
"path": "weekly-cleanup.yaml",
"schedule": "0 0 * * 0"
}
]
}
}
Workflow Management
List Workflows
# List available workflows
repoindex workflow list
# List with details
repoindex workflow list --detailed
# Search workflows
repoindex workflow search "release"
Validate Workflows
# Validate workflow syntax
repoindex workflow validate my-workflow.yaml
# Validate with verbose output
repoindex workflow validate my-workflow.yaml --verbose
Workflow History
# Show workflow execution history
repoindex workflow history
# Show specific workflow history
repoindex workflow history --workflow release-pipeline
# Show failed runs
repoindex workflow history --status failed
Advanced Features
Workflow Composition
Include other workflows:
name: Master Workflow
steps:
- id: morning
action: repoindex.workflow
parameters:
file: morning-routine.yaml
- id: release
action: repoindex.workflow
parameters:
file: release-pipeline.yaml
variables:
version: "1.2.0"
when: "{{ workflow.date.weekday == 'Friday' }}"
Dynamic Step Generation
steps:
- id: generate-steps
action: repoindex.generate
parameters:
template: |
{% for repo in repos %}
- id: process-{{ repo.name }}
action: repoindex.audit
parameters:
repository: {{ repo.path }}
{% endfor %}
Workflow Templates
Create reusable templates:
# templates/audit-template.yaml
name: Audit Template
parameters:
- name: repository
required: true
- name: checks
default: [license, security]
steps:
- id: audit
action: repoindex.audit
parameters:
repository: "{{ parameters.repository }}"
check: "{{ parameters.checks }}"
Use the template:
steps:
- id: use-template
template: templates/audit-template.yaml
parameters:
repository: my-repo
checks: [license, security, documentation]
Best Practices
- Modular Workflows: Break complex workflows into smaller, reusable components
- Error Handling: Always include error handling and recovery steps
- Logging: Add logging steps for debugging and monitoring
- Testing: Test workflows with --dry-run before production use
- Version Control: Store workflows in git with your project
- Documentation: Document workflow purpose, inputs, and outputs
- Idempotency: Design workflows to be safely re-runnable
Troubleshooting
Debug Mode
# Run with debug output
repoindex workflow run my-workflow.yaml --debug
# Save debug logs
repoindex workflow run my-workflow.yaml --debug --log-file debug.log
Common Issues
Step not executing
- Check the step's when condition
- Verify dependencies are met
- Check for previous step failures
Variable not found
- Ensure variable is defined
- Check variable scope
- Verify template syntax
Timeout errors
- Increase step or global timeout
- Add retry configuration
- Break into smaller steps
API Reference
Python API
from repoindex.integrations.workflow import Workflow, WorkflowRunner
# Load workflow
workflow = Workflow.from_file('my-workflow.yaml')
# Set variables
workflow.set_variables({
'environment': 'production',
'version': '1.2.0'
})
# Run workflow
runner = WorkflowRunner()
result = runner.run(workflow)
# Check results
if result.success:
print(f"Workflow completed in {result.duration}s")
for step_id, step_result in result.steps.items():
print(f"{step_id}: {step_result.status}")
else:
print(f"Workflow failed: {result.error}")
CLI Reference
# Main commands
repoindex workflow run # Run a workflow
repoindex workflow validate # Validate workflow syntax
repoindex workflow list # List available workflows
repoindex workflow history # Show execution history
repoindex workflow debug # Debug workflow execution
# Common options
--dry-run # Preview without executing
--var KEY=VALUE # Set workflow variables
--verbose # Verbose output
--debug # Debug mode
--timeout SECONDS # Global timeout
--max-parallel N # Max parallel steps
Next Steps
- Explore example workflows
- Learn about Clustering Integration for analysis workflows
- Check Tutorial Notebooks for interactive examples
- See API Documentation for detailed reference