Building the Dashboard¶
The dashboard builder creates a self-contained interactive HTML file for exploring papers. No server required!
How It Works¶
The builder:
- Takes the final JSON with embeddings and metadata
- Embeds data in the HTML file (JSON + FAISS index)
- Includes JavaScript for interactivity
- Generates visualization with d3.js scatter plot
- Creates search index for full-text search
- Exports single HTML file ready to share
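The core idea behind a self-contained dashboard can be sketched in a few lines of Python. This is only an illustration of the inline-data technique, not PaperTrail's actual builder: the function name `build_self_contained_html` and the `paper-data` element id are hypothetical.

```python
import html
import json


def build_self_contained_html(papers, title="Paper Dashboard"):
    """Return an HTML page with the paper data inlined as JSON (hypothetical helper)."""
    payload = json.dumps(papers)
    # Escape "</" so a string such as "</script>" inside the data cannot
    # close the inline script element early. "\/" is a valid JSON escape.
    payload = payload.replace("</", "<\\/")
    return (
        "<!DOCTYPE html>\n"
        "<html>\n<head>\n<meta charset=\"utf-8\">\n"
        f"<title>{html.escape(title)}</title>\n</head>\n<body>\n"
        f'<script id="paper-data" type="application/json">{payload}</script>\n'
        "<script>\n"
        "const papers = JSON.parse(document.getElementById('paper-data').textContent);\n"
        "console.log(`Loaded ${papers.length} papers`);\n"
        "</script>\n</body>\n</html>\n"
    )
```

Because the data travels inside the page, the file can be opened from disk without any server, at the cost of a larger file.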
Basic Usage¶
Build Dashboard¶
Open the generated file in your browser, or just double-click it!
Customize Title¶
Add Description¶
papertrail build papers_final.json -o dashboard.html \
--description "Papers shared in our Slack workspace"
Dashboard Features¶
Table View¶
- Sortable columns: Click header to sort
- Searchable: Use Ctrl+F (Cmd+F on macOS) in the browser
- Scrollable: Vertical and horizontal scroll
- Columns: Title, authors, journal, year, citations, channel, date, engagement
Click a row to open the detail panel.
Embedding Map¶
2D Scatter Plot of papers using d3.js:
- Hover: See paper title and details
- Click: Open detail panel
- Zoom: Scroll to zoom in/out
- Pan: Click and drag to move around
Use the Color by dropdown to switch coloring:
- Cluster: k-means clusters (auto-computed)
- Channel: Slack channel
- User: Who shared it
- Date: Timeline gradient
- Year: Publication year
- Citations: Citation count gradient
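For readers unfamiliar with the Cluster option, here is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. It is a generic illustration, not the clustering code PaperTrail ships; the farthest-first initialization is a choice made here to keep the example deterministic.

```python
import math


def kmeans(points, k, iterations=20):
    """Cluster points (lists of floats) into k groups; return one label per point."""
    # Farthest-first initialization: start from the first point, then keep
    # adding the point farthest from all centroids chosen so far.
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(list(farthest))

    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels
```

Each label can then be mapped to a color, which is roughly what coloring by cluster means on the map.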
Use the Projection dropdown to switch 2D projections:
- UMAP (recommended)
- t-SNE
- PCA
Detail Panel¶
Click a paper to see:
- Full title and authors
- Abstract
- Journal, year, citation count
- DOI, arXiv ID, URL
- Engagement metrics
- Channel and user who shared
- Direct link to original Slack message
Semantic Search¶
Chat-style search interface:
- Type a query: "transformer attention mechanisms"
- Results appear: Similar papers ranked by relevance
- Click result: Opens detail panel
- Autocomplete: Suggests paper titles as you type
Search uses the embedded FAISS index for sub-millisecond lookups across all embeddings.
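Conceptually, semantic search ranks papers by how close their embeddings are to the query embedding. The sketch below shows that ranking with a brute-force cosine similarity scan in plain Python. It stands in for the FAISS index purely for illustration, and the `embedding` and `title` field names are assumptions about the paper records.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_embedding, papers, top_k=5):
    """Return the top_k (score, title) pairs, most similar first."""
    scored = [
        (cosine_similarity(query_embedding, paper["embedding"]), paper["title"])
        for paper in papers
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```

A brute-force scan like this is linear in the number of papers; an index such as FAISS exists to make the same lookup fast at scale.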
Advanced Options¶
Customize Colors¶
papertrail build papers_final.json -o dashboard.html \
--primary-color "#2196F3" \
--accent-color "#FF9800"
Set Default Projection¶
Set Default Coloring¶
Include Additional Metadata¶
Add custom fields to display:
papertrail build papers_final.json -o dashboard.html \
--extra-fields "keywords,institution,funding"
Data Size Optimization¶
For large datasets (10,000+ papers), compress the embedded data with the --compress flag.
This reduces file size significantly with minimal quality loss.
Template Customization¶
Use a custom HTML template:
Sharing & Deployment¶
Share Locally¶
# Simply copy the file
cp dashboard.html ~/Dropbox/papers_dashboard.html
# Or email it (-a attaches a file in some mail implementations; check yours)
mail -a dashboard.html user@example.com
Host on Web Server¶
Then access at https://server.com/papers.html
GitHub Pages¶
# Commit to repo
git add dashboard.html
git commit -m "Update paper dashboard"
git push
# View at https://username.github.io/repo/dashboard.html
Google Drive / Dropbox¶
Upload the HTML file and share it via link. These services typically show a preview or the raw source rather than running the page, so recipients should:
- Download the file
- Open it locally in any modern browser
Customization¶
Edit HTML Directly¶
The dashboard HTML is a single file. You can edit it:
<!-- Change the title -->
<title>My Research Papers</title>
<!-- Modify colors in CSS -->
<style>
.header { background-color: #2196F3; }
</style>
<!-- Add custom scripts -->
<script>
// Your custom JavaScript here
</script>
Modify Table Columns¶
Edit the data extraction section to show different fields:
Customize Search¶
Modify search weighting:
Performance Tips¶
For Large Datasets (10,000+ papers)¶
- Enable compression (--compress)
- Limit the initial display of the table (lazy load)
- Use an efficient projection (PCA is fastest)
Browser Optimization¶
For very large datasets (20,000+ papers):
- Use Chrome/Edge (faster d3.js rendering)
- Close other tabs
- Increase the JavaScript heap limit (in Chrome, V8 options are passed through --js-flags at launch):
--js-flags="--max-old-space-size=4096"
File Size¶
Check file size:
Typical sizes:
- 100 papers: 2-5 MB
- 1,000 papers: 20-50 MB
- 10,000 papers: 200-500 MB
Compress with gzip for storage:
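One way to do this from Python, using only the standard library, is sketched below. The helper name `gzip_file` is hypothetical and not part of PaperTrail.

```python
import gzip
import os
import shutil


def gzip_file(path):
    """Write path + '.gz' next to the original; return (original_bytes, compressed_bytes)."""
    gz_path = path + ".gz"
    # Stream the file through gzip so large dashboards are never fully in memory.
    with open(path, "rb") as src, gzip.open(gz_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return os.path.getsize(path), os.path.getsize(gz_path)
```

The compressed copy is meant for storage or transfer: it has to be decompressed again before a browser can open it from disk.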
Python API¶
Build dashboards programmatically:
from papertrail.preview import DashboardBuilder
# Create builder
builder = DashboardBuilder()
# Build dashboard
builder.build(
papers=papers_with_embeddings,
output_path="dashboard.html",
title="My Papers",
description="Papers from our Slack workspace",
default_coloring="cluster",
default_projection="umap"
)
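The call above expects `papers_with_embeddings` to exist already. A minimal sketch of preparing it is shown below, assuming the final JSON is a list of objects that store their vector under an `embedding` key. Both the file layout and the `load_papers` helper are assumptions, not part of the documented API.

```python
import json


def load_papers(path):
    """Load papers from JSON and keep only those that carry an embedding."""
    with open(path, encoding="utf-8") as f:
        papers = json.load(f)
    with_embeddings = [p for p in papers if p.get("embedding")]
    # Mixed vector sizes usually mean two embedding backends were combined.
    dims = {len(p["embedding"]) for p in with_embeddings}
    if len(dims) > 1:
        raise ValueError(f"Inconsistent embedding sizes: {sorted(dims)}")
    return with_embeddings


# Example (assumes papers_final.json exists in the working directory):
# papers_with_embeddings = load_papers("papers_final.json")
```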
Tips & Tricks¶
Export Data from Dashboard¶
In the browser console (F12):
// Get all papers as JSON
const data = JSON.stringify(window.papers);
console.log(data);
// Copy to clipboard
copy(data);
Embed Dashboard in Website¶
Share with Export Settings¶
Create multiple dashboards with different defaults:
# One with cluster coloring
papertrail build papers_final.json -o dashboard_clusters.html \
--default-coloring cluster
# One with channel coloring
papertrail build papers_final.json -o dashboard_channels.html \
--default-coloring channel
# One with timeline coloring
papertrail build papers_final.json -o dashboard_timeline.html \
--default-coloring date
Add Search Filters¶
Modify HTML to add predefined search filters:
// Quick filter buttons
const quickFilters = {
"Deep Learning": "neural network deep learning",
"Biology": "cell biology genetics",
"Statistics": "statistical analysis hypothesis test"
};
Track Search Popularity¶
// Log popular searches
document.addEventListener("search", (e) => {
console.log(`Searched for: ${e.detail.query}`);
});
Troubleshooting¶
Dashboard is slow¶
Check:
- File size (compress if >200MB)
- Number of papers (UMAP is slower for 10,000+)
- Browser (use Chrome/Edge)
- Try PCA projection instead of UMAP
Search returns no results¶
Check:
- Papers have abstracts (required for search)
- Query words are spelled correctly
- Try shorter queries ("deep learning" vs "distributed deep learning systems")
Map doesn't show¶
Check:
- Papers have embeddings (required)
- Projection was computed (--projections umap)
- Browser JavaScript is enabled
File size is huge¶
Solutions:
1. Compress: --compress
2. Use PCA (smaller embedding space)
3. Use HuggingFace backend (384D vs 1536D embeddings)
4. Reduce number of papers
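To see why embedding dimensionality matters so much, a rough estimate of the raw embedding payload helps. The sketch below assumes 4 bytes per value (float32), which is an assumption about storage. Embeddings written as JSON text usually take more space than this.

```python
def raw_embedding_megabytes(num_papers, dims, bytes_per_value=4):
    """Raw size of all embeddings in MB, assuming fixed-width numeric storage."""
    return num_papers * dims * bytes_per_value / 1_000_000


large = raw_embedding_megabytes(10_000, 1536)  # 61.44
small = raw_embedding_megabytes(10_000, 384)   # 15.36
print(f"1536D: {large} MB, 384D: {small} MB")
```

Under these assumptions, switching from 1536D to 384D vectors cuts the embedding payload to a quarter.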
Colors don't match expectations¶
Check:
- Coloring dropdown is set correctly
- Papers have required metadata (channel, date, etc.)
- Color palette is appropriate for your data
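Several of the checks in this section come down to papers missing a field. A quick way to count them before rebuilding is sketched below. The field names (`abstract`, `embedding`, `channel`, `date`) are assumptions about the JSON layout, and `missing_field_report` is a hypothetical helper.

```python
from collections import Counter


def missing_field_report(papers, fields=("abstract", "embedding", "channel", "date")):
    """Count, per field, how many papers lack it or have it empty."""
    missing = Counter()
    for paper in papers:
        for field in fields:
            if not paper.get(field):
                missing[field] += 1
    return {field: missing[field] for field in fields}
```

A high count for `embedding` would point to an empty map, and a high count for `abstract` to poor search results.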
Next Steps¶
- Searching Papers — Use semantic search API
- Koo Lab Example — Real-world dashboard example
- API Reference: Preview — Detailed Python API