---
name: matplotlib
description: Low-level plotting library for full customization. Use when you need fine-grained control over every plot element, creating novel plot types, or integrating with specific scientific workflows. Export to PNG/PDF/SVG for publication. For quick statistical plots use seaborn; for interactive plots use plotly; for publication-ready multi-panel figures with journal styling, use scientific-visualization.
license: https://github.com/matplotlib/matplotlib/tree/main/LICENSE
metadata:
  version: "1.0"
  skill-author: K-Dense Inc.
---

# Matplotlib

## Overview

Matplotlib is Python's foundational visualization library for creating static, animated, and interactive plots. This skill provides guidance on using matplotlib effectively, covering both the pyplot interface (MATLAB-style) and the object-oriented API (Figure/Axes), along with best practices for creating publication-quality visualizations.

## When to Use This Skill

This skill should be used when:
- Creating any type of plot or chart (line, scatter, bar, histogram, heatmap, contour, etc.)
- Generating scientific or statistical visualizations
- Customizing plot appearance (colors, styles, labels, legends)
- Creating multi-panel figures with subplots
- Exporting visualizations to various formats (PNG, PDF, SVG, etc.)
- Building interactive plots or animations
- Working with 3D visualizations
- Integrating plots into Jupyter notebooks or GUI applications

## Core Concepts

### The Matplotlib Hierarchy

Matplotlib uses a hierarchical structure of objects:

1. **Figure** - The top-level container for all plot elements
2. **Axes** - The actual plotting area where data is displayed (one Figure can contain multiple Axes)
3. **Artist** - Everything visible on the figure (lines, text, ticks, etc.)
4. **Axis** - The number line objects (x-axis, y-axis) that handle ticks and labels
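
The hierarchy above can be checked directly on a live figure. The following is a minimal sketch, assuming a headless `Agg` backend so it runs without a display; the variable names are illustrative only.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.artist import Artist
from matplotlib.axis import Axis
from matplotlib.figure import Figure

fig, axes = plt.subplots(1, 2)      # one Figure holding two Axes
(line,) = axes[0].plot([1, 2, 3])   # plot() returns a list of Line2D artists

is_figure = isinstance(fig, Figure)             # the top-level container
tracks_axes = fig.axes == list(axes)            # the Figure keeps a list of its Axes
has_axis = isinstance(axes[0].xaxis, Axis)      # each Axes owns Axis objects
line_is_artist = isinstance(line, Artist)       # drawn elements are Artists
axes_is_artist = isinstance(axes[0], Artist)    # Axes themselves are Artists too

print(is_figure, tracks_axes, has_axis, line_is_artist, axes_is_artist)
plt.close(fig)
```

Note that `Axes` and `Axis` are different classes: an `Axes` is a plotting area, while an `Axis` is one of the number lines it owns.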

### Two Interfaces

**1. pyplot Interface (Implicit, MATLAB-style)**
```python
import matplotlib.pyplot as plt

plt.plot([1, 2, 3, 4])
plt.ylabel('some numbers')
plt.show()
```
- Convenient for quick, simple plots
- Maintains state automatically
- Good for interactive work and simple scripts

**2. Object-Oriented Interface (Explicit)**
```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([1, 2, 3, 4])
ax.set_ylabel('some numbers')
plt.show()
```
- **Recommended for most use cases**
- More explicit control over figure and axes
- Better for complex figures with multiple subplots
- Easier to maintain and debug

## Common Workflows

### 1. Basic Plot Creation

**Single plot workflow:**
```python
import matplotlib.pyplot as plt
import numpy as np

# Create figure and axes (OO interface - RECOMMENDED)
fig, ax = plt.subplots(figsize=(10, 6))

# Generate and plot data
x = np.linspace(0, 2*np.pi, 100)
ax.plot(x, np.sin(x), label='sin(x)')
ax.plot(x, np.cos(x), label='cos(x)')

# Customize
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_title('Trigonometric Functions')
ax.legend()
ax.grid(True, alpha=0.3)

# Save and/or display
fig.savefig('plot.png', dpi=300, bbox_inches='tight')
plt.show()
```

### 2. Multiple Subplots

**Creating subplot layouts:**
```python
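import matplotlib.pyplot as plt
import numpy as np

# Sample data so the snippets below run as written
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = y1 = np.sin(x)
y2 = np.cos(x)
data = rng.normal(size=500)
categories, values = ['A', 'B', 'C'], [3, 7, 5]

# Method 1: Regular grid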
fig, axes = plt.subplots(2, 2, figsize=(12, 10))
axes[0, 0].plot(x, y1)
axes[0, 1].scatter(x, y2)
axes[1, 0].bar(categories, values)
axes[1, 1].hist(data, bins=30)

# Method 2: Mosaic layout (more flexible)
fig, axes = plt.subplot_mosaic([['left', 'right_top'],
                                 ['left', 'right_bottom']],
                                figsize=(10, 8))
axes['left'].plot(x, y)
axes['right_top'].scatter(x, y)
axes['right_bottom'].hist(data)

# Method 3: GridSpec (maximum control)
from matplotlib.gridspec import GridSpec
fig = plt.figure(figsize=(12, 8))
gs = GridSpec(3, 3, figure=fig)
ax1 = fig.add_subplot(gs[0, :])  # Top row, all columns
ax2 = fig.add_subplot(gs[1:, 0])  # Bottom two rows, first column
ax3 = fig.add_subplot(gs[1:, 1:])  # Bottom two rows, last two columns
```

### 3. Plot Types and Use Cases

**Line plots** - Time series, continuous data, trends
```python
ax.plot(x, y, linewidth=2, linestyle='--', marker='o', color='blue')
```

**Scatter plots** - Relationships between variables, correlations
```python
ax.scatter(x, y, s=sizes, c=colors, alpha=0.6, cmap='viridis')
```

**Bar charts** - Categorical comparisons
```python
ax.bar(categories, values, color='steelblue', edgecolor='black')
# For horizontal bars:
ax.barh(categories, values)
```

**Histograms** - Distributions
```python
ax.hist(data, bins=30, edgecolor='black', alpha=0.7)
```

**Heatmaps** - Matrix data, correlations
```python
im = ax.imshow(matrix, cmap='coolwarm', aspect='auto')
plt.colorbar(im, ax=ax)
```

**Contour plots** - 3D data on 2D plane
```python
contour = ax.contour(X, Y, Z, levels=10)
ax.clabel(contour, inline=True, fontsize=8)
```

**Box plots** - Statistical distributions
```python
# Matplotlib >= 3.9 names this parameter tick_labels; older releases use labels=
ax.boxplot([data1, data2, data3], tick_labels=['A', 'B', 'C'])
```

**Violin plots** - Distribution densities
```python
ax.violinplot([data1, data2, data3], positions=[1, 2, 3])
```

For comprehensive plot type examples and variations, refer to `references/plot_types.md`.

### 4. Styling and Customization

**Color specification methods:**
- Named colors: `'red'`, `'blue'`, `'steelblue'`
- Hex codes: `'#FF5733'`
- RGB tuples: `(0.1, 0.2, 0.3)`
- Colormaps: `cmap='viridis'`, `cmap='plasma'`, `cmap='coolwarm'`
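
As a rough illustration of the list above, every color specification resolves to the same RGBA representation. This sketch uses `matplotlib.colors.to_rgba` and `plt.get_cmap`; the variable names are placeholders.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib import colors as mcolors

named = mcolors.to_rgba("steelblue")          # named color
hex_code = mcolors.to_rgba("#FF5733")         # hex code
rgb_tuple = mcolors.to_rgba((0.1, 0.2, 0.3))  # RGB tuple, alpha defaults to 1

# A colormap maps a number in [0, 1] to an RGBA tuple
cmap = plt.get_cmap("viridis")
midpoint = cmap(0.5)

for name, rgba in [("named", named), ("hex", hex_code),
                   ("tuple", rgb_tuple), ("viridis(0.5)", midpoint)]:
    print(name, tuple(round(channel, 3) for channel in rgba))
```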

**Using style sheets:**
```python
plt.style.use('seaborn-v0_8-darkgrid')  # Apply predefined style
# Available styles: 'ggplot', 'bmh', 'fivethirtyeight', etc.
print(plt.style.available)  # List all available styles
```

**Customizing with rcParams:**
```python
plt.rcParams['font.size'] = 12
plt.rcParams['axes.labelsize'] = 14
plt.rcParams['axes.titlesize'] = 16
plt.rcParams['xtick.labelsize'] = 10
plt.rcParams['ytick.labelsize'] = 10
plt.rcParams['legend.fontsize'] = 12
plt.rcParams['figure.titlesize'] = 18
```
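
Assigning to `plt.rcParams` changes settings for the rest of the session. When the change should apply to one figure only, `plt.rc_context` is one option. The sketch below assumes a headless backend, and the chosen sizes are arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

default_size = plt.rcParams["font.size"]

# Settings inside the with-block apply only to figures created there
with plt.rc_context({"font.size": 12, "axes.labelsize": 14}):
    fig, ax = plt.subplots()
    ax.set_xlabel("x")
    inside_size = plt.rcParams["font.size"]

restored_size = plt.rcParams["font.size"]  # back to the previous value
print(default_size, inside_size, restored_size)
plt.close(fig)
```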

**Text and annotations:**
```python
ax.text(x, y, 'annotation', fontsize=12, ha='center')
ax.annotate('important point', xy=(x, y), xytext=(x+1, y+1),
            arrowprops=dict(arrowstyle='->', color='red'))
```

For detailed styling options and colormap guidelines, see `references/styling_guide.md`.

### 5. Saving Figures

**Export to various formats:**
```python
# High-resolution PNG for presentations/papers
plt.savefig('figure.png', dpi=300, bbox_inches='tight', facecolor='white')

# Vector format for publications (scalable)
plt.savefig('figure.pdf', bbox_inches='tight')
plt.savefig('figure.svg', bbox_inches='tight')

# Transparent background
plt.savefig('figure.png', dpi=300, bbox_inches='tight', transparent=True)
```

**Important parameters:**
- `dpi`: Resolution (300 for publications, 150 for web, 72 for screen)
- `bbox_inches='tight'`: Removes excess whitespace
- `facecolor='white'`: Forces a white figure background (useful when the active style uses a dark or transparent background)
- `transparent=True`: Transparent background
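
The same figure can be written to several formats in one pass. This is a minimal sketch that saves into in-memory buffers instead of files, so the `outputs` dictionary and the loop are illustrative rather than part of any required workflow.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(4, 3))
ax.plot([0, 1, 2], [0, 1, 4])

outputs = {}
for fmt in ("png", "pdf", "svg"):
    buffer = io.BytesIO()
    # format= is required here because a buffer has no file extension
    fig.savefig(buffer, format=fmt, dpi=300, bbox_inches="tight")
    outputs[fmt] = buffer.getvalue()
plt.close(fig)

print({fmt: len(payload) for fmt, payload in outputs.items()})
```

With a filename, the format is inferred from the extension, so `format=` can be dropped.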

### 6. Working with 3D Plots

```python
import matplotlib.pyplot as plt  # on Matplotlib >= 3.2, projection='3d' works without importing Axes3D

fig = plt.figure(figsize=(10, 8))
ax = fig.add_subplot(111, projection='3d')

# Surface plot
ax.plot_surface(X, Y, Z, cmap='viridis')

# 3D scatter
ax.scatter(x, y, z, c=colors, marker='o')

# 3D line plot
ax.plot(x, y, z, linewidth=2)

# Labels
ax.set_xlabel('X Label')
ax.set_ylabel('Y Label')
ax.set_zlabel('Z Label')
```

## Best Practices

### 1. Interface Selection
- **Use the object-oriented interface** (`fig, ax = plt.subplots()`) for production code
- Reserve pyplot interface for quick interactive exploration only
- Always create figures explicitly rather than relying on implicit state

### 2. Figure Size and DPI
- Set figsize at creation: `fig, ax = plt.subplots(figsize=(10, 6))`
- Use appropriate DPI for output medium:
  - Screen/notebook: 72-100 dpi
  - Web: 150 dpi
  - Print/publications: 300 dpi
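
The relationship between `figsize`, `dpi`, and output pixels can be verified by reading the dimensions back from a saved PNG. The sketch below assumes a 4 x 3 inch figure at 150 dpi and parses the PNG header with the standard library.

```python
import io
import struct

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(4, 3))  # inches
buffer = io.BytesIO()
# bbox_inches='tight' is left out on purpose: it crops the canvas,
# so the output would no longer be exactly figsize * dpi
fig.savefig(buffer, format="png", dpi=150)
plt.close(fig)

# Bytes 16-23 of a PNG hold the image width and height as big-endian integers
width_px, height_px = struct.unpack(">II", buffer.getvalue()[16:24])
print(width_px, height_px)
```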

### 3. Layout Management
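- Use `constrained_layout=True` at figure creation or call `fig.tight_layout()` after plotting to prevent overlapping elements
- `fig, ax = plt.subplots(constrained_layout=True)` is recommended for automatic spacing; Matplotlib 3.5+ also accepts the equivalent `layout='constrained'`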
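
A small sketch of constrained layout on a 2x2 grid follows. It draws the figure and then compares the bounding boxes of a top-row x label and the title below it. The labels and figure size are arbitrary choices for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, axes = plt.subplots(2, 2, figsize=(6, 4), constrained_layout=True)
for ax in axes.flat:
    ax.plot([0, 1], [0, 1])
    ax.set_title("Panel title")
    ax.set_xlabel("x label")
    ax.set_ylabel("y label")

fig.canvas.draw()  # constrained layout is solved when the figure is drawn

# Compare the top-left x label with the bottom-left title in display space
label_box = axes[0, 0].xaxis.label.get_window_extent()
title_box = axes[1, 0].title.get_window_extent()
labels_collide = label_box.overlaps(title_box)
print("x label collides with the title below:", labels_collide)
plt.close(fig)
```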

### 4. Colormap Selection
- **Sequential** (viridis, plasma, inferno): Ordered data with consistent progression
- **Diverging** (coolwarm, RdBu): Data with meaningful center point (e.g., zero)
- **Qualitative** (tab10, Set3): Categorical/nominal data
- Avoid rainbow colormaps (jet) - they are not perceptually uniform
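
For diverging data, the colormap's neutral color should sit on the meaningful center value. One way to sketch this is with `matplotlib.colors.TwoSlopeNorm`; the synthetic `anomalies` array is made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import TwoSlopeNorm

# Synthetic data that is not symmetric around zero (illustrative only)
rng = np.random.default_rng(0)
anomalies = rng.normal(loc=0.5, scale=1.0, size=(20, 20))

# Pin zero to the middle of the colormap even though the range is lopsided
norm = TwoSlopeNorm(vmin=anomalies.min(), vcenter=0.0, vmax=anomalies.max())

fig, ax = plt.subplots()
im = ax.imshow(anomalies, cmap="RdBu", norm=norm)
fig.colorbar(im, ax=ax, label="anomaly")

center_position = float(norm(0.0))  # position of zero within the colormap
print(center_position)
plt.close(fig)
```

A simpler alternative, when a symmetric range is acceptable, is to pass `vmin=-v, vmax=v` with `v` set to the largest absolute value.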

### 5. Accessibility
- Use colorblind-friendly colormaps (viridis, cividis)
- Add patterns/hatching for bar charts in addition to colors
- Ensure sufficient contrast between elements
- Include descriptive labels and legends
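
As a sketch of the hatching advice above, each bar can carry its own pattern so categories stay distinguishable without color. The categories, values, and patterns below are placeholders.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

categories = ["A", "B", "C"]
values = [3, 7, 5]
patterns = ["//", "..", "xx"]

fig, ax = plt.subplots()
bars = ax.bar(categories, values, color="lightgray", edgecolor="black")

# Give every bar a distinct hatch and a legend entry describing it
for bar, pattern, name in zip(bars, patterns, categories):
    bar.set_hatch(pattern)
    bar.set_label(f"Group {name}")
ax.legend()

applied = [bar.get_hatch() for bar in bars]
print(applied)
plt.close(fig)
```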

### 6. Performance
- For large datasets, pass `rasterized=True` in plot calls to keep vector outputs (PDF/SVG) small: the dense artist is embedded as a bitmap while text and axes stay vector
- Use appropriate data reduction before plotting (e.g., downsample dense time series)
- For animations, use blitting for better performance
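
The first two points can be sketched together. The stride-based downsampling below is a deliberately naive assumption (it can alias fast oscillations), and the data are synthetic.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 100, 200_000)
y = np.sin(x) + rng.normal(scale=0.1, size=x.size)

# Naive downsampling: keep every 100th sample for the line
step = 100
x_small, y_small = x[::step], y[::step]

fig, ax = plt.subplots()
ax.plot(x_small, y_small, linewidth=1)
# The dense scatter becomes a bitmap inside the PDF; axes and text stay vector
points = ax.scatter(x[::20], y[::20], s=1, alpha=0.3, rasterized=True)

buffer = io.BytesIO()
fig.savefig(buffer, format="pdf", dpi=150)  # dpi sets the rasterized resolution
plt.close(fig)
print(len(x_small), len(buffer.getvalue()))
```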

### 7. Code Organization
```python
# Good practice: Clear structure
def create_analysis_plot(data, title):
    """Create standardized analysis plot."""
    fig, ax = plt.subplots(figsize=(10, 6), constrained_layout=True)

    # Plot data
    ax.plot(data['x'], data['y'], linewidth=2)

    # Customize
    ax.set_xlabel('X Axis Label', fontsize=12)
    ax.set_ylabel('Y Axis Label', fontsize=12)
    ax.set_title(title, fontsize=14, fontweight='bold')
    ax.grid(True, alpha=0.3)

    return fig, ax

# Use the function
fig, ax = create_analysis_plot(my_data, 'My Analysis')
fig.savefig('analysis.png', dpi=300, bbox_inches='tight')
```

## Quick Reference Scripts

This skill includes helper scripts in the `scripts/` directory:

### `plot_template.py`
Template script demonstrating various plot types with best practices. Use this as a starting point for creating new visualizations.

**Usage:**
```bash
python scripts/plot_template.py
```

### `style_configurator.py`
Interactive utility to configure matplotlib style preferences and generate custom style sheets.

**Usage:**
```bash
python scripts/style_configurator.py
```

## Detailed References

For comprehensive information, consult the reference documents:

- **`references/plot_types.md`** - Complete catalog of plot types with code examples and use cases
- **`references/styling_guide.md`** - Detailed styling options, colormaps, and customization
- **`references/api_reference.md`** - Core classes and methods reference
- **`references/common_issues.md`** - Troubleshooting guide for common problems

## Integration with Other Tools

Matplotlib integrates well with:
- **NumPy/Pandas** - Direct plotting from arrays and DataFrames
- **Seaborn** - High-level statistical visualizations built on matplotlib
- **Jupyter** - Static inline output with `%matplotlib inline`, or interactive figures with `%matplotlib widget` (requires the `ipympl` package)
- **GUI frameworks** - Embedding in Tkinter, Qt, wxPython applications

## Common Gotchas

1. **Overlapping elements**: Use `constrained_layout=True` or `tight_layout()`
2. **State confusion**: Use OO interface to avoid pyplot state machine issues
3. **Memory issues with many figures**: Close figures explicitly with `plt.close(fig)`
4. **Font warnings**: Install the missing font, or point `plt.rcParams['font.sans-serif']` at a font that is installed
5. **DPI confusion**: Remember that figsize is in inches, not pixels: `pixels = dpi * inches`
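
Gotcha 3 is easy to hit in batch jobs, because pyplot keeps every figure alive until it is closed. A minimal sketch of a save-and-close loop, assuming in-memory output in place of real files:

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

plt.close("all")  # start from a clean slate

for slope in range(3):
    fig, ax = plt.subplots()
    ax.plot([0, 1], [0, slope])
    fig.savefig(io.BytesIO(), format="png")
    plt.close(fig)  # release the figure before creating the next one

open_figures = plt.get_fignums()  # figure numbers pyplot still tracks
print(open_figures)
```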

## Additional Resources

- Official documentation: https://matplotlib.org/
- Gallery: https://matplotlib.org/stable/gallery/index.html
- Cheatsheets: https://matplotlib.org/cheatsheets/
- Tutorials: https://matplotlib.org/stable/tutorials/index.html
