Data Operations

v1 Skill

Dataset handling, file operations, data transformation

langfuse, skill, optional
Capabilities

Generates code snippets and implementations

Content

Gene: Data Operations

Description

Data handling and manipulation capabilities. Combines Hugging Face datasets patterns with file operations to cover loading, cleaning, transforming, and exporting data.

Trigger Conditions

  • User needs to work with datasets
  • Data cleaning or transformation
  • File import/export operations
  • Data analysis requests

Capabilities

Dataset Operations

  • Load datasets (Hugging Face, CSV, JSON, Parquet)
  • Filter and select data
  • Transform columns
  • Handle missing values
  • Train/test splits
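
The dataset operations above can be sketched with the standard library alone. This is a minimal illustration, not the skill's actual implementation: the sample rows and the helper names fill_missing and train_test_split are hypothetical, and a real pipeline would more likely use a dataset library.

```python
import random

# Hypothetical in-memory records standing in for a loaded dataset.
rows = [
    {"id": 1, "age": 34, "city": "Oslo"},
    {"id": 2, "age": None, "city": "Lima"},
    {"id": 3, "age": 51, "city": "Oslo"},
    {"id": 4, "age": 27, "city": None},
    {"id": 5, "age": 45, "city": "Lima"},
]


def fill_missing(records, column, default):
    """Return new records with None in `column` replaced by `default`."""
    return [
        {**r, column: default if r[column] is None else r[column]}
        for r in records
    ]


def train_test_split(records, test_fraction=0.2, seed=0):
    """Shuffle a copy of the records and split off a test portion."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Handle missing values first, then filter rows and transform a column.
# Every step builds new lists, so `rows` itself is never modified.
cleaned = fill_missing(fill_missing(rows, "age", 0), "city", "unknown")
adults_over_30 = [r for r in cleaned if r["age"] > 30]
with_upper_city = [{**r, "city": r["city"].upper()} for r in cleaned]

train, test = train_test_split(cleaned, test_fraction=0.2, seed=42)
```

Filling a missing age with 0 is only a placeholder choice; which default is appropriate depends on the data.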

File Operations

  • Read/write various formats
  • Batch processing
  • File transformation
  • Directory traversal
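
As one possible reading of batch processing combined with directory traversal, the sketch below walks a folder and converts every CSV file to JSONL while mirroring the folder layout. The function names csv_to_jsonl and convert_tree and the demo files are assumptions made for illustration.

```python
import csv
import json
import tempfile
from pathlib import Path


def csv_to_jsonl(src: Path, dst: Path) -> int:
    """Convert one CSV file to JSONL; return the number of rows written."""
    count = 0
    with src.open(newline="", encoding="utf-8") as fin, \
            dst.open("w", encoding="utf-8") as fout:
        for row in csv.DictReader(fin):
            fout.write(json.dumps(row) + "\n")
            count += 1
    return count


def convert_tree(in_dir: Path, out_dir: Path) -> dict:
    """Walk in_dir recursively and convert every .csv, mirroring the layout."""
    results = {}
    for src in sorted(in_dir.rglob("*.csv")):
        relative = src.relative_to(in_dir)
        dst = (out_dir / relative).with_suffix(".jsonl")
        dst.parent.mkdir(parents=True, exist_ok=True)
        results[relative.as_posix()] = csv_to_jsonl(src, dst)
    return results


# Demo on a throwaway directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    in_dir, out_dir = Path(tmp, "in"), Path(tmp, "out")
    (in_dir / "nested").mkdir(parents=True)
    (in_dir / "a.csv").write_text("x,y\n1,2\n3,4\n", encoding="utf-8")
    (in_dir / "nested" / "b.csv").write_text("x,y\n5,6\n", encoding="utf-8")
    summary = convert_tree(in_dir, out_dir)
    first_line = (out_dir / "a.jsonl").read_text(encoding="utf-8").splitlines()[0]
```

Writing to a separate output directory keeps the source files intact.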

Data Analysis

  • Summary statistics
  • Distribution analysis
  • Correlation analysis
  • Data visualization prep
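
For the analysis items above, a small standard-library sketch might look like the following. The helpers summarize, histogram, and pearson and the sample numbers are illustrative assumptions; the histogram output is the kind of binned table a plotting step could consume.

```python
import math
import statistics
from collections import Counter


def summarize(values):
    """Basic summary statistics for a list of numbers."""
    return {
        "count": len(values),
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


def histogram(values, bin_width):
    """Count values per bin; keys are the lower edge of each bin."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical sample: hours studied versus test score.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 70]

stats = summarize(hours)
bins = histogram(scores, bin_width=10)
r = pearson(hours, scores)
```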

Supported Formats

  • CSV/TSV
  • JSON/JSONL
  • Parquet
  • Excel (.xlsx)
  • HDF5 (for ML)
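
One way to handle several formats is to dispatch on the file extension. The sketch below wires up only the formats the Python standard library can parse (CSV, TSV, JSON, JSONL); Parquet, Excel, and HDF5 are commonly read with third-party packages such as pyarrow, openpyxl, or h5py, so they are left as an explicit error here. The names LOADERS and load_records are hypothetical.

```python
import csv
import json
import tempfile
from pathlib import Path


def _read_delimited(path, delimiter):
    with path.open(newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f, delimiter=delimiter))


def _read_json(path):
    with path.open(encoding="utf-8") as f:
        return json.load(f)


def _read_jsonl(path):
    with path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Extension -> loader. Only standard-library formats are wired up.
LOADERS = {
    ".csv": lambda p: _read_delimited(p, ","),
    ".tsv": lambda p: _read_delimited(p, "\t"),
    ".json": _read_json,
    ".jsonl": _read_jsonl,
}

# Listed formats that need a third-party reader (not implemented in this sketch).
NEEDS_EXTRA_LIBRARY = {".parquet", ".xlsx", ".h5", ".hdf5"}


def load_records(path):
    """Load a file based on its extension, or raise a descriptive error."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix in LOADERS:
        return LOADERS[suffix](path)
    if suffix in NEEDS_EXTRA_LIBRARY:
        raise NotImplementedError(f"{suffix} requires a third-party reader")
    raise ValueError(f"unsupported format: {suffix or path.name}")


# Demo with a temporary TSV file.
with tempfile.TemporaryDirectory() as tmp:
    tsv_path = Path(tmp, "sample.tsv")
    tsv_path.write_text("a\tb\n1\t2\n", encoding="utf-8")
    loaded = load_records(tsv_path)
```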

Execution Protocol

Step 1: Data Discovery

  • Identify data sources
  • Determine file formats
  • Assess data quality
  • Plan transformation pipeline
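
Assessing data quality during discovery could start with a simple per-column profile. The following is a sketch under the assumption that records are dictionaries and that None or an empty string means "missing"; the profile function and sample rows are illustrative.

```python
from collections import Counter


def profile(records):
    """Per-column missing counts and observed value types."""
    columns = sorted({key for r in records for key in r})
    report = {}
    for col in columns:
        values = [r.get(col) for r in records]
        missing = sum(v is None or v == "" for v in values)
        types = Counter(
            type(v).__name__ for v in values if v is not None and v != ""
        )
        report[col] = {"missing": missing, "types": dict(types)}
    return report


# Hypothetical records with a blank value, an absent key, and mixed types.
sample = [
    {"id": 1, "price": "9.5"},
    {"id": 2, "price": ""},
    {"id": 3, "price": 12.0},
    {"id": 4},
]

quality = profile(sample)
# Columns holding more than one type usually need cleaning before analysis.
mixed_columns = [c for c, info in quality.items() if len(info["types"]) > 1]
```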

Step 2: Data Loading

  • Load from appropriate source
  • Parse format correctly
  • Handle encoding issues
  • Validate structure
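
Handling encoding issues and validating structure might be sketched as below: try a short list of candidate encodings in order, then check that the expected columns are present. The encoding list, the function names, and the sample bytes are assumptions; note that latin-1 accepts any byte sequence, so it acts as a last resort and may decode text incorrectly.

```python
import csv
import io


def decode_with_fallback(raw, encodings=("utf-8-sig", "cp1252", "latin-1")):
    """Return (text, encoding) for the first candidate that decodes cleanly."""
    for enc in encodings:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue
    raise ValueError("could not decode with any candidate encoding")


def parse_and_validate(text, required_columns):
    """Parse CSV text and fail early if an expected column is absent."""
    reader = csv.DictReader(io.StringIO(text))
    missing = set(required_columns) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    return list(reader)


# Hypothetical bytes saved by a tool that used Windows-1252, not UTF-8.
raw_bytes = "name,city\nZo\u00eb,M\u00e1laga\n".encode("cp1252")

text, used_encoding = decode_with_fallback(raw_bytes)
records = parse_and_validate(text, required_columns=["name", "city"])
```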

Step 3: Transformation

  • Clean missing values
  • Encode categorical data
  • Normalize/standardize
  • Feature engineering
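
The first three transformation items can be illustrated with small pure functions. This sketch assumes mean imputation, one-hot encoding, and z-score standardization; these are common choices, not necessarily the ones this skill applies. Feature engineering is left out because it depends on the task.

```python
import statistics


def impute_mean(values):
    """Replace None with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mean = statistics.fmean(observed)
    return [mean if v is None else v for v in values]


def one_hot(values):
    """Map each category to a 0/1 vector; columns are the sorted categories."""
    categories = sorted(set(values))
    return categories, [[int(v == c) for c in categories] for v in values]


def standardize(values):
    """Z-score scaling: subtract the mean, divide by the population std dev."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]  # constant column: nothing to scale
    return [(v - mean) / std for v in values]


# Hypothetical columns.
ages = [20, None, 40]
colors = ["red", "blue", "red"]

ages_filled = impute_mean(ages)
ages_scaled = standardize(ages_filled)
color_names, color_vectors = one_hot(colors)
```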

Step 4: Output

  • Export to desired format
  • Validate output
  • Document transformations
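
The output step could be sketched as an export followed by a read-back check and a small log file recording what was done. The function export_with_log, the log file naming, and the log fields are hypothetical; the round-trip comparison assumes records contain only JSON-native values.

```python
import json
import tempfile
from pathlib import Path


def export_with_log(records, out_path, transformations):
    """Write JSONL, verify it by reading it back, and save a log next to it."""
    out_path = Path(out_path)
    with out_path.open("w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")

    # Validate output: what was written must match what we meant to write.
    with out_path.open(encoding="utf-8") as f:
        written = [json.loads(line) for line in f]
    if written != records:
        raise RuntimeError("round-trip check failed")

    # Document transformations in a sidecar file, e.g. out.log.json.
    log_path = out_path.with_name(out_path.stem + ".log.json")
    log = {"rows": len(written), "transformations": transformations}
    log_path.write_text(json.dumps(log, indent=2), encoding="utf-8")
    return log_path


# Demo in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    data = [{"id": 1, "city": "OSLO"}, {"id": 2, "city": "LIMA"}]
    steps = ["filled missing city with 'unknown'", "upper-cased city"]
    log_file = export_with_log(data, Path(tmp, "out.jsonl"), steps)
    log_name = log_file.name
    saved_log = json.loads(log_file.read_text(encoding="utf-8"))
```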

Best Practices

  • Preserve original data
  • Document all transformations
  • Validate at each step
  • Handle errors gracefully

Guardrails

  • Don't overwrite original data without backup
  • Handle sensitive data appropriately
  • Report data quality issues
  • Validate output completeness
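
The first guardrail can be made concrete with a write helper that keeps a backup copy before replacing a file. This is a sketch only; the name safe_overwrite and the .bak and .tmp suffixes are assumptions.

```python
import shutil
import tempfile
from pathlib import Path


def safe_overwrite(path, new_text):
    """Write new_text to path, keeping a .bak copy of any existing file."""
    path = Path(path)
    backup = None
    if path.exists():
        backup = path.with_name(path.name + ".bak")
        shutil.copy2(path, backup)  # copy contents and metadata
    # Write to a temporary sibling first, then rename it over the target,
    # so a failed write does not leave a half-written file in place.
    tmp_path = path.with_name(path.name + ".tmp")
    tmp_path.write_text(new_text, encoding="utf-8")
    tmp_path.replace(path)
    return backup


# Demo in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp, "data.csv")
    target.write_text("v1", encoding="utf-8")
    backup_path = safe_overwrite(target, "v2")
    current = target.read_text(encoding="utf-8")
    saved = backup_path.read_text(encoding="utf-8")
    fresh_backup = safe_overwrite(Path(tmp, "new.csv"), "first")
```

A second call would overwrite the previous .bak file, so a timestamped backup name may be preferable when several versions need to be kept.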

Integration

  • Works with: code-execution, ml-pipeline, document-generation
  • Essential for data-driven tasks