Collaborative data analysis using Jobs

This guide explains how to work with colleagues on the same dataset using Projects and Jobs: how to share access, find each other’s processing runs, rerun workflows, and keep your work organized.

What gets shared when you collaborate

When you share a project with someone, they get access to:

Raw data and uploaded files in the project
Processed outputs generated by jobs (in the project folders)
The project's full job history, including jobs run before they joined

That means your teammates can review, reproduce, and improve your analysis without exchanging files manually.

Roles: Manager vs Collaborator

A project can be shared with two permission levels:

Manager
- Can access data and jobs
- Can invite more people
- Best for team leads or project owners

Collaborator
- Can access data and jobs
- Can run analysis
- Cannot invite others

Assign the Manager role sparingly to keep access under control.

Step 1 : Share the project

Go to Projects
Select the project
Find the Managers / Collaborators section
Click Edit
Add people as Collaborators or Managers

Tip: Add at least one other Manager as a backup (for vacations or handovers).
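
The role rules and the backup tip above can be sketched as a small check. This is a minimal illustration, not the platform's API: the `Role` enum, the helper names, and the sample team are all hypothetical.

```python
from enum import Enum


class Role(Enum):
    MANAGER = "Manager"
    COLLABORATOR = "Collaborator"


def can_invite(role: Role) -> bool:
    """Only Managers can invite more people, per the roles described above."""
    return role is Role.MANAGER


def has_backup_manager(members: dict[str, Role]) -> bool:
    """True when at least two people hold the Manager role."""
    managers = [name for name, role in members.items() if role is Role.MANAGER]
    return len(managers) >= 2


# Hypothetical team, used only for illustration.
team = {
    "alice": Role.MANAGER,
    "bob": Role.COLLABORATOR,
    "carol": Role.COLLABORATOR,
}

print(can_invite(team["bob"]))   # False
print(has_backup_manager(team))  # False: only one Manager so far
```

Running a check like this before a handover shows at a glance whether the project would be left without anyone able to invite new people.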

Step 2 : Find your team’s jobs and outputs

Option A: Start from Jobs

Go to Analysis → Jobs
Use the available filters/search:
- Search by triggerer (the person who ran the job)
- Search by job template name (e.g., “Render Molecule” or “DIALS”)
Open a job to view:
- Input files
- Parameters used
- Logs
- Output files
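
If you keep your own list of job records (for example, copied into a notebook), the two filters above can be sketched in Python. The `Job` fields, the `filter_jobs` helper, and the sample records are hypothetical and do not represent the platform's API.

```python
from dataclasses import dataclass


@dataclass
class Job:
    job_id: str
    triggerer: str
    template: str


def filter_jobs(jobs, triggerer=None, template=None):
    """Return jobs matching the given triggerer and/or template name.

    Matching is case-insensitive; a filter left as None is ignored.
    """
    result = []
    for job in jobs:
        if triggerer is not None and job.triggerer.lower() != triggerer.lower():
            continue
        if template is not None and job.template.lower() != template.lower():
            continue
        result.append(job)
    return result


# Hypothetical records for illustration only.
jobs = [
    Job("J-001", "alice", "DIALS"),
    Job("J-002", "bob", "Render Molecule"),
    Job("J-003", "alice", "Render Molecule"),
]

for job in filter_jobs(jobs, triggerer="alice", template="Render Molecule"):
    print(job.job_id)  # J-003
```

Combining both filters narrows a long job list to one person's runs of one template, which is usually the quickest route to the job you want to open.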

Option B: Start from the project file space

Open the Project
Go to the Process (or output) folder
Each job typically creates its own output folder
Browse results (images, logs, reports, output datasets)

Use this option when the job list is long and you want to see quickly what was produced.

Step 3 : Review a teammate’s job

For any job you can usually inspect:

Job Details (inputs, parameters, machine type)
Output files
Logs (useful for troubleshooting)

This is the fastest way to understand what a colleague did and whether the result looks correct.

Step 4 : Rerun a teammate’s job to iterate

Rerunning is the recommended way to collaborate:

You keep the same workflow and starting point
You can adjust parameters, inputs, or performance settings
The rerun creates a new output folder (so you won’t overwrite anyone’s results)

How to rerun

Open the job in Jobs
Click Rerun
Review the step-by-step flow:
- Inputs (verify file types are correct)
- Parameters (tune for improved results)
- Machine type/performance (CPU vs GPU, turbo vs standard)
Start the job

Result: Your rerun will appear in the job list with you as the triggerer.

Step 5 : Coordinate inside the project

Use the Project Logbook

Each project has a Logbook where the team can write:

Why someone was invited
What has been processed and what still needs doing
Which job IDs correspond to key results
Notes about “good parameter sets”

Suggested logbook template:

Goal / Dataset
Latest best job(s): Job ID(s) + short reason
What changed: parameters / inputs
Next steps / open questions
Owner & date
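
One way to keep entries consistent is to fill the template from a small script and paste the result into the Logbook. The sketch below assumes nothing about the platform: the `LogbookEntry` class and all sample values are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class LogbookEntry:
    goal: str
    best_jobs: dict[str, str]  # job ID -> short reason
    changes: str
    next_steps: str
    owner: str
    entry_date: date

    def render(self) -> str:
        """Format the entry using the headings from the suggested template."""
        jobs = "; ".join(
            f"{job_id} ({reason})" for job_id, reason in self.best_jobs.items()
        )
        return "\n".join([
            f"Goal / Dataset: {self.goal}",
            f"Latest best job(s): {jobs}",
            f"What changed: {self.changes}",
            f"Next steps / open questions: {self.next_steps}",
            f"Owner & date: {self.owner}, {self.entry_date.isoformat()}",
        ])


# Hypothetical values for illustration only.
entry = LogbookEntry(
    goal="Example dataset, first processing pass",
    best_jobs={"J-003": "cleanest render so far"},
    changes="switched view to all, cartoon style",
    next_steps="try GPU machine type for the movie render",
    owner="alice",
    entry_date=date(2024, 1, 15),
)
print(entry.render())
```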

Add comments to jobs (coming soon)

Once job comments are available, use them to capture notes such as:

“This run used GPU for the movie render”
“Fixed wrong input file type”
“Best output: view=all + cartoon style”

Best practices for smooth collaboration

Name projects clearly (dataset + date + purpose)
Use consistent job templates across the team
Record “best run” job IDs in the logbook
Use output folders as the “source of truth” for results, and jobs as the reproducible history
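
The naming practice above (dataset + date + purpose) can be made repeatable with a small helper. This is a sketch under assumptions: the `project_name` function, the separator choice, and the sample values are hypothetical, and the platform may have its own rules for allowed characters.

```python
import re
from datetime import date


def project_name(dataset: str, day: date, purpose: str) -> str:
    """Build a name of the form <dataset>_<YYYY-MM-DD>_<purpose>."""
    def slug(text: str) -> str:
        # Lowercase, then collapse any run of non-alphanumeric characters into "-".
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    return f"{slug(dataset)}_{day.isoformat()}_{slug(purpose)}"


print(project_name("Lysozyme Batch 3", date(2024, 1, 15), "Initial Screening"))
# lysozyme-batch-3_2024-01-15_initial-screening
```

Putting the date in ISO format keeps projects in chronological order when they are sorted by name.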