Privacy Policy
Effective date: 2026-05-03
This Privacy Policy explains how Astrafield LLC (“Astrafield,” “we,” “us,” or “our”), operating under the trade name CodebaseLM, collects, uses, shares, and protects information when you visit codebaselm.ai or use the CodebaseLM service (the “Service”). CodebaseLM is a voice-first AI tool for exploring GitHub repositories. By using the Service, you agree to the practices described here.
1. Who we are
CodebaseLM is a product of Astrafield LLC, a US limited liability company. Astrafield is the “data controller” for personal information processed through the Service. If you have questions about this policy or our data practices, see “Contact us” at the bottom of this page.
2. Data we collect
We try to collect only what we need to run the Service. The categories below describe what we may collect when you use CodebaseLM.
Account information
When you sign in, we use Google OAuth or GitHub OAuth (via NextAuth v5). These providers send us a basic profile: your name, email address, profile image, and a stable user identifier. For GitHub sign-in, we also receive an access token scoped to the permissions you grant.
Repository data
To generate a tour, we access repository content via the GitHub API. For public repositories, we use a server-side installation token. For private repositories (where supported), we use the OAuth user token you authorized at sign-in. We read code, commit history, README and documentation files, and high-level repository metadata. We do not use your code to train AI models.
Voice audio and transcripts
When you ask a question by voice, your audio is streamed to a third-party speech-recognition provider for transcription. The assistant's spoken reply is generated through a third-party speech-synthesis provider and streamed back to your browser. We retain the resulting text transcripts as part of your tour record so you can revisit the conversation; the underlying audio segments are not stored long-term (see “Data retention”).
Conversation history and Tour records
For each tour you start, we store a Tour row in our database. This includes the repository you toured, a question count (qaCount), the transcript of your questions and the assistant's answers, and metadata such as timestamps.
Billing information
Paid plans are processed by Polar.sh, which acts as our Merchant of Record. Polar collects and stores your payment-method details, billing address, and tax information. We never see or store your full card number; we only receive a subscription identifier, plan, and status from Polar.
Cookies and local storage
We use a small number of cookies and browser-side storage keys. See “Cookies & local storage” below for the full list.
Technical and usage data
When you use the Service, our servers and our analytics provider automatically receive: IP address, user-agent string, approximate location derived from IP, pages and features used, and timestamps. We use PostHog for product analytics; PostHog events include page views, button clicks, and a hashed user identifier so we can understand how the product is used.
3. How we use your data
- Provide the Service. Authenticate you, fetch repository data, run AI generations, transcribe and synthesize voice, and save tours so you can revisit them.
- Improve our prompts and product. We review aggregated and anonymized usage signals (and, in limited cases, opted-in transcripts) to refine the AI prompts and ranking we use. We do not use your code or transcripts to train third-party foundation models.
- Billing. Process subscriptions and entitlement checks via Polar.
- Abuse prevention and security. Detect spam, brute-force sign-in attempts, scraping, and other abuse; enforce rate limits and our Terms of Service.
- Communication. Send transactional emails (sign-in confirmations, billing receipts, important account or security notices). We will only send marketing email if you opt in.
- Legal compliance. Comply with applicable laws, lawful requests, and our legal obligations.
4. Third-party processors
We rely on a small set of vendor categories to operate CodebaseLM. Each processor handles your data only on our behalf and under its own privacy policy. Where a vendor is named below, it's because you interact with that vendor's name directly (on the sign-in screen, the checkout page, or via standard analytics disclosures).
- AI language model providers — used to generate tour narration and answer your questions. Your questions and the relevant repository excerpts are sent to these providers to produce the response. We do not allow these providers to train their models on your data.
- Speech providers — used for speech-to-text transcription of your voice questions and text-to-speech synthesis of the assistant's replies. Audio is processed in transit; see “Data retention”.
- Cloud hosting and database — used to run the application, store your account, and store tour records. Infrastructure is located in cloud regions in the United States.
- OAuth identity providers — sign-in is handled by GitHub and Google. GitHub is also the source of repository content you ask us to tour.
- Payments — Polar.sh is our Merchant of Record for paid plans. Polar processes payment details, billing addresses, and tax information.
- Product analytics (PostHog) — used to understand how the product is used (page views, button clicks, hashed user identifiers). You can disable it via the in-product analytics opt-out.
We may add or replace processors over time. If we do, we will update this list and, where required, notify users.
5. Data retention
- Tour records and qaCount are retained while your account is active so you can revisit your history. They are deleted when you delete your account (see “Your rights”), subject to a short window for backups to expire.
- Voice audio segments are processed in transit and are not stored long-term. Transient buffers may exist briefly in our or our STT/TTS providers' infrastructure while a request is in flight.
- Account information (email, OAuth provider link) is retained until you delete the account.
- Billing records held by Polar.sh follow Polar's retention schedule and any applicable tax-law retention requirements.
- Backups of our database are retained on a rolling basis (typically up to 30 days), after which deletions propagate.
- Logs (request logs, error logs) are retained for a limited period for debugging and security purposes.
6. Your rights
Depending on where you live, you may have rights under laws such as the GDPR (EU/UK), CCPA/CPRA (California), and similar regimes. These can include the right to access your data, correct it, delete it, export it, restrict or object to processing, and withdraw consent.
- Access and export. You can view your tours from your dashboard. To request a full export, contact us.
- Deletion. You can delete your account from Settings → Danger Zone. This removes your account, tour history, and personal profile from our systems (subject to backup-expiry windows).
- Analytics opt-out. Where the product surfaces an analytics opt-out toggle, you can disable PostHog event collection. Browser-level “Do Not Track” signals are also respected where supported.
- Complaints. EU/UK users may lodge a complaint with their local data-protection supervisory authority.
7. Cookies & local storage
We do not use third-party advertising or tracking pixels. The cookies and storage keys we set are limited to the following:
- Session cookie (set by NextAuth) — keeps you signed in across page loads.
- anon-tour-id cookie — a randomly generated identifier so anonymous (signed-out) visitors can start a tour and we can attribute it to that browser until they sign in.
- repotour-theme in localStorage — remembers your light/dark/high-contrast theme preference.
- PostHog cookies — set by PostHog for product analytics; can be disabled by opting out of analytics.
8. Children
CodebaseLM is not directed to children under 13 (or the equivalent minimum age in your jurisdiction). We do not knowingly collect personal information from children under 13. If you believe a child has provided us with personal information, please contact us and we will delete it.
9. International data transfers
CodebaseLM's primary infrastructure is hosted in cloud regions in the United States. Some of our processors may process data in other regions. If you access the Service from outside the United States, your data will be transferred to and processed in the United States and possibly other countries, where data-protection laws may differ from those in your jurisdiction. Where required, we rely on appropriate transfer mechanisms such as the EU Standard Contractual Clauses.
10. Security
We use industry-standard safeguards to protect your data: encryption in transit (HTTPS/TLS), encryption at rest for our database and storage, OAuth-based authentication, and least-privilege access controls for our team. No system is perfectly secure; if you believe your account has been compromised, please contact us immediately.
11. Changes to this policy
We may update this Privacy Policy from time to time. The “Effective date” at the top of this page reflects the most recent revision. For material changes, we will notify account holders by email and/or in-product before the change takes effect. Your continued use of the Service after the effective date constitutes acceptance of the updated policy.
12. Contact us
For questions about this Privacy Policy, to exercise your rights, or to report a concern, email privacy@codebaselm.ai. You can also reach our general support team at support@codebaselm.ai.
Astrafield LLC, operating as CodebaseLM.