How a repetitive, click-by-click office task — DGIST receipt (expense) processing — was automated end-to-end using Claude with Computer Use. Built from the actual prompts used during the session.
Receipt processing on the DGIST portal is a perfect automation target: it is repetitive, rule-based, and done entirely by clicking buttons in a fixed order on a web page.
Before the steps, here are the principles that shaped the workflow. They transfer to almost any GUI-automation task.
Login and security steps stay manual. The script opens the login page; you log in yourself. Never hand credentials to automation.
Don't describe the whole flow at once. Drive one screen at a time: take a screenshot, tell Claude which button to click, verify, repeat.
login.py handles the session, work.py does the task, config.py holds the inputs. Each piece is testable on its own.
Employee numbers, approval numbers, and file paths become inputs in config.py, so the same script handles every receipt.
If a required input can't be extracted, the script stops instead of silently submitting a wrong claim.
After building the parts, run the whole pipeline — login → download → process — and watch it work before trusting it.
The session naturally broke into four phases. Each phase below links to the prompts that drove it.
Open the login page, log in manually, and keep the session alive so the work script can reuse it.
Use screenshots to define exactly which button to press on each portal page until the full claim is filled in.
Download the relevant Gmail messages & attachments, then extract the approval number from the receipt image.
Move inputs into config.py, fail on missing data, and run login → download → process end-to-end.
Get a working, logged-in browser that the automation can drive — with the login itself done by hand.
State the task and keep login manual. Describe the repetitive task and explicitly say which part you'll do yourself.
Correct the target page. The post-login URL was wrong, and one short instruction fixed it.
Split login from work, and persist the session. A recurring early problem: after manual login, the session wasn't being saved, so the work script saw a logged-out browser.
Separate login.py from work.py, the script that does the actual task."
Make work.py reuse the same session."
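One way to persist the session is sketched below. It assumes the scripts drive the browser with Playwright, which the source does not confirm. After the manual login, login.py saves the browser's storage state to a file, and work.py refuses to start unless that file exists and contains cookies. The file name `session.json`, the function names, and the `login_url` parameter are hypothetical.

```python
import json
from pathlib import Path

SESSION_FILE = Path("session.json")  # hypothetical file name


def save_session_after_manual_login(login_url: str, session_file: Path = SESSION_FILE) -> None:
    """login.py side: open the login page, wait for a human, then save the session."""
    # Imported lazily so the helper below also works without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=False)  # visible window for the manual login
        context = browser.new_context()
        page = context.new_page()
        page.goto(login_url)
        input("Log in in the browser window, then press Enter here... ")
        context.storage_state(path=str(session_file))  # writes cookies and local storage
        browser.close()


def load_saved_session(session_file: Path = SESSION_FILE) -> dict:
    """work.py side: return the saved state, or fail loudly if it is unusable."""
    if not session_file.exists():
        raise RuntimeError(f"{session_file} not found: run login.py first")
    state = json.loads(session_file.read_text(encoding="utf-8"))
    if not state.get("cookies"):
        raise RuntimeError(f"{session_file} has no cookies: log in again")
    return state
```

Under the same assumption, work.py would call `load_saved_session()` first and then open its browser context with `browser.new_context(storage_state=str(SESSION_FILE))`, so a logged-out browser is caught before any form is touched.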
This is the heart of the session. Each portal screen is handled with a screenshot + a precise instruction. Below are the recurring patterns rather than all 30+ clicks.
One screen, one screenshot, one instruction. The whole expense form was built up like this — selecting payers, project code, budget item, bank account, card-approval lookup, attaching proof, and saving.
When a click fails, describe the UI more precisely. Several screens needed a second, sharper instruction — this is normal and expected.
Help Claude notice pop-ups. A few times the automation didn't realize a new window had opened. A one-line nudge fixed it.
Respect ordering constraints the UI imposes. The portal required the form to be saved before proof could be attached, a real-world quirk discovered mid-flow.
74955789) → Enter → double-click the matching row. A "real user" (실 사용자) field was also added later via the same magnifier → search → double-click pattern.
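The one-screen-at-a-time loop can be sketched as a list of steps, each with an action and a check, where the run halts at the first step whose check fails. This is an illustrative structure, not the script produced in the session. `Step`, `run_steps`, and the stand-in `screen` dictionary are hypothetical; in a real script the callables would click and inspect the browser page.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    name: str
    action: Callable[[], None]   # e.g. click one button
    verify: Callable[[], bool]   # e.g. check that the expected screen appeared


def run_steps(steps: list[Step]) -> list[str]:
    """Run steps in order; stop at the first one whose verification fails."""
    done = []
    for step in steps:
        step.action()
        if not step.verify():
            raise RuntimeError(f"step '{step.name}' did not reach the expected screen")
        done.append(step.name)
    return done


# Stand-in for the browser: a dict that records what has happened so far.
screen = {"saved": False, "proof_attached": False}


def save_form() -> None:
    screen["saved"] = True


def attach_proof() -> None:
    # Mirrors the portal's ordering rule: proof can only be attached after saving.
    if screen["saved"]:
        screen["proof_attached"] = True


steps = [
    Step("save form", save_form, lambda: screen["saved"]),
    Step("attach proof", attach_proof, lambda: screen["proof_attached"]),
]
print(run_steps(steps))
```

Pairing every click with a check means an unnoticed pop-up or a wrong button surfaces at the step where it happened, not several screens later.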
Instead of typing each receipt's data by hand, the inputs are pulled straight from email and the receipt image.
Download the relevant emails & attachments.
Save them as case_1, case_2, ..."
credentials.json from Google Cloud + adding yourself as a test user to clear the access_denied screen). Claude can walk you through the Cloud console screens.
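The filing step can be sketched with the standard library alone, assuming each message is already available as raw RFC 822 bytes (the Gmail API can return a message in that form, base64url-encoded, when requested with `format="raw"`). The Gmail fetching itself is left out, and `save_case` is a hypothetical name.

```python
from email import message_from_bytes, policy
from pathlib import Path


def save_case(raw_message: bytes, root: Path, index: int) -> list[Path]:
    """Write every attachment of one message into root/case_<index>/."""
    msg = message_from_bytes(raw_message, policy=policy.default)
    parts = list(msg.iter_attachments())
    if not parts:
        raise ValueError(f"message {index} has no attachments")

    case_dir = root / f"case_{index}"
    case_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for part in parts:
        # Keep only the base name so a crafted filename cannot escape case_dir.
        name = Path(part.get_filename() or "attachment.bin").name
        target = case_dir / name
        target.write_bytes(part.get_payload(decode=True) or b"")
        saved.append(target)
    return saved
```

A message without attachments raises an error, so an empty case folder never reaches the later steps unnoticed.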
Extract data from the attachments. Read images out of the .hwp meeting minutes, and pull the approval number directly from the receipt.
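The extraction check might look like the sketch below. It assumes the receipt image has already been turned into text (by OCR or by Claude reading the image) and that the approval number is an 8-digit number following the label 승인번호. Both the label and the digit count are assumptions to adjust to the real receipts, and `extract_approval_number` is a hypothetical name.

```python
import re

# Assumed format: the label "승인번호" (approval number), an optional colon,
# then exactly 8 digits that are not followed by another digit.
APPROVAL_RE = re.compile(r"승인\s*번호\s*[:：]?\s*(\d{8})(?!\d)")


def extract_approval_number(receipt_text: str) -> str:
    """Return the single approval number found in the text, or raise."""
    found = set(APPROVAL_RE.findall(receipt_text))
    if len(found) != 1:
        raise ValueError(f"expected exactly one approval number, found {len(found)}")
    return found.pop()
```

Requiring exactly one match treats both "nothing found" and "two different candidates" as failures, which fits the rule of stopping before a wrong claim is submitted.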
Turn the one-off script into a reusable tool, and confirm the whole pipeline works.
Move inputs into a config file — and fail if they're missing.
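Make work.py take the employee number, approval number, and file path as inputs. Put these values in config.py and have work.py read them when it runs. If extracting a value fails (or the value is malformed), treat it as a failure."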
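A minimal sketch of inputs held in config.py and validated before work.py touches the portal. The field names and `load_inputs` are hypothetical; the actual config.py from the session is not shown in the source.

```python
from dataclasses import dataclass
from pathlib import Path

REQUIRED = ("employee_number", "approval_number", "proof_file")


@dataclass(frozen=True)
class ReceiptInputs:
    employee_number: str
    approval_number: str
    proof_file: Path


def load_inputs(raw: dict) -> ReceiptInputs:
    """Validate the values taken from config.py; raise instead of guessing."""
    missing = [key for key in REQUIRED if not str(raw.get(key) or "").strip()]
    if missing:
        raise ValueError(f"missing required inputs: {', '.join(missing)}")

    approval = str(raw["approval_number"]).strip()
    if not approval.isdigit():
        raise ValueError("approval_number must contain only digits")

    proof = Path(raw["proof_file"])
    if not proof.is_file():
        raise FileNotFoundError(f"proof file not found: {proof}")

    return ReceiptInputs(str(raw["employee_number"]).strip(), approval, proof)
```

In this arrangement work.py would build the dictionary from the values defined in config.py and call `load_inputs` as its first action, so a bad value stops the run before the browser opens.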
Run and verify the full pipeline.
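The end-to-end run can be sketched as a small runner that executes each script in order and stops at the first failure. login.py and work.py are named in the source; `download.py` is a hypothetical name for the download step, and `build_commands` and `run_pipeline` are illustrative.

```python
import subprocess
import sys

PIPELINE = ["login.py", "download.py", "work.py"]  # download.py is a hypothetical name


def build_commands(scripts: list[str]) -> list[list[str]]:
    """Run every script with the same Python interpreter as this runner."""
    return [[sys.executable, script] for script in scripts]


def run_pipeline(commands: list[list[str]]) -> None:
    """Run each command in order; a non-zero exit code aborts the rest."""
    for cmd in commands:
        print("running:", " ".join(cmd))
        subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
```

Calling `run_pipeline(build_commands(PIPELINE))` would then run login → download → process in one go. Because the child processes inherit the terminal, the manual login prompt in login.py still works.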
A checklist for turning any of your repetitive tasks into an automation.
The best first target is something you do the same way every time — forms, data entry, downloading and filing attachments.
Walk through the task once with screenshots and plain-language instructions. You don't need to know the page's HTML — just describe what you see.
Log in yourself; review before the final "submit". Automation fills the form — you stay responsible for sending it.
Dropdowns, pop-ups, and look-alike buttons may take a second, more precise instruction. That back-and-forth is the normal cost of building a reliable script.
Pull the changing values into a config file, make missing data a hard error, then reuse the same script for every case (case_1, case_2, ...).