Giuseppe Della Corte

panda{·}etl - Automate your document workflows

Turn messy files into actionable data. Upload PDFs, images, audio and websites. Define data points for AI-powered extraction. See results in exportable spreadsheets with linked, highlighted sources. Ask questions, plot charts and draft reports on top.

Giuseppe Della Corte
👋 Hey Product Hunt community! We're excited to share something we've been working on at Sinaptik (YC W24). After creating pandas-ai (chat with your tables), we've had countless conversations with data analysts and business experts about their daily struggles. These chats revealed some common frustrations:

1. Valuable insights buried in messy, hard-to-read files
2. The headache of managing documents with different permission settings
3. RAG chatbots that seemed promising but ended up being costly data dumps
4. The ever-present challenge of bridging the gap between business experts and developers

These weren't just abstract problems - we saw how they affected real people trying to do their jobs efficiently. So we rolled up our sleeves and got to work. We talked to analysts, data scientists, and business users to understand their needs. The result is panda{·}etl, a tool we hope will make life easier for anyone dealing with document-heavy workflows.

With panda{·}etl, you can:

1. Upload those tricky files (you know, the PDFs, images, and audio files that usually cause headaches)
2. Define exactly what data you need (ESG metrics, competitor data, market trends, risk engineering reports, insurance claims, and more)
3. Get spreadsheets where you can actually trace where each piece of data came from
4. Easily validate and export your data
5. Use our pandas-ai powered chat to explore your extractions, plot charts, and add them to drafts

We've built panda{·}etl with flexibility in mind, offering solutions ranging from SaaS to on-premise:

1. For individuals, we offer a free personal plan with a limited number of documents processed and extractions per month. It's perfect for trying out the tool and for smaller projects.
2. For businesses, we have scalable plans that grow with your needs, file sizes, and document volume.
3. For enterprises, we provide custom solutions, including on-premise deployments for those with specific security or compliance requirements.
We're still learning and improving, and that's why we're here. We'd love to hear your thoughts, experiences, or even your data horror stories. How do you deal with unstructured data in your work? What solutions have you tried? Let's chat - we're genuinely curious to learn from this community! 🙌
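The "define data points, extract, trace sources" workflow described above can be sketched in a few lines. This is a hypothetical illustration, not the real panda{·}etl API: the `build_extraction_schema` helper and the field-dictionary shape are assumptions made for the example.

```python
# Hypothetical sketch of the panda{.}etl workflow: turn user-defined
# data points (name, description, type) into the kind of field list
# an extraction request might carry. Names and shapes are assumptions,
# not the actual product API.

def build_extraction_schema(fields):
    """Convert a {name: (description, dtype)} mapping into a list of
    field definitions for an extraction process."""
    return [
        {"name": name, "description": desc, "type": dtype}
        for name, (desc, dtype) in fields.items()
    ]

schema = build_extraction_schema({
    "company_name": ("Legal name of the company", "string"),
    "co2_emissions": ("Reported CO2 emissions in tonnes", "number"),
})
```

Each extracted value would then be written to a spreadsheet cell alongside a pointer back to the source page, which is how the "linked, highlighted sources" traceability in the blurb could work.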
Alexandros Fokianos
@gdc been following you guys for a long time. As a friend, I'm so proud of seeing you evolve the product and test things out. As a fellow startupper, I think you're on track to solve huge problems, not only for enterprise companies but also for startups and scaleups that manage large amounts of data. Great launch!
Gabriele Venturi
@fok96 thanks a lot mate!
Giuseppe Della Corte
@fok96 excited to have Zefi's team support. Let's rock!
Andrew Nelson
@gdc congrats on the launch, Giuseppe! 🙌 looks like a neat product, looking forward to trying it out 😊
Serge Tim
Congrats! Do you have an API? My use case: I want to build a QA system for my CSV tables with survey results (some cells may contain numbers, and some contain text), and it needs to work within my product
Gabriele Venturi
@s5f5f5f thanks a lot! Yes, we also offer an API that is a perfect match for your use case. Feel free to reach out, would love to learn more about your use case: gabriele@sinaptik.ai
Jose Quan
@gabriele_venturi I am also interested in an API, how can we get in touch?
Max
It's sleeeeeeek! Congrats @gabriele_venturi & team 🔥🔥🔥
Gabriele Venturi
@mxcrbn thanks a lot mate 🔥
Nicole Park
Congrats on the launch, @gabriele_venturi ! I love that there's also an open-source version. It will be useful across a variety of fields in different industries. Wishing your team great success! - I'll definitely give it a try. :)
Gabriele Venturi
@nicolepark thanks a lot for the kind feedback, we’ll do our best to keep going!
Şeyma Alan
Congrats! The product seems very successful. Are you able to extract information from tables in PDFs?
Gabriele Venturi
@seymaalan yes, and soon we are releasing v2 of our pdf parser, which will be even more accurate 😄
Jonathan Viet Pham
Congrats to the panda.etl team! This tool sounds like a fantastic way to simplify turning unstructured data into actionable insights. Looking forward to seeing how it helps streamline data extraction!
Gabriele Venturi
@vietpham thanks a lot! Can’t wait to hear your feedback if you have a chance to test it out!
Nicola Sebastianelli
Really great product. Amazing UI!
Gabriele Venturi
@nicola_sebastianelli thanks a lot, and this is only the beginning 🚀
Giuseppe Della Corte
@nicola_sebastianelli thanks! Without effective UI and UX there is no true data democratization!
James Wilson
This sounds really intriguing, @gdc! I'm curious about the specific types of data points that can be extracted. How customizable is the extraction process for different file types? Also, what kind of support do you offer for users who might be new to data extraction? Would love to know more!
Gabriele Venturi
@james_wilson_ it can extract structured data from any unstructured data (PDFs, audio, etc.). Your input is a set of unstructured files and your output is an easy-to-use Excel spreadsheet. It's quite intuitive for non-tech users too, but we also offer an onboarding call!
Giuseppe Della Corte
@james_wilson_ you can already try it on GitHub. We developed it with flexibility, simplicity, and accuracy in mind. After you create a project you can add a new extraction process, define field names, and get pre-filled descriptions and data types you can fully modify. Looking forward to hearing your feedback!
Emma Schneider
This sounds like a game-changer for data handling! 🦾 It's like you took the pain points of data analysts and turned them into a scalable solution. The ability to upload various file types and get actionable insights is something we seriously need. Can't wait to see how panda{·}etl evolves—very curious about the pricing models for businesses too! Keep those updates coming, @gdc! How's the initial traction looking?
Gabriele Venturi
@wenzhu_zhang1 thanks a lot for the feedback! Initial traction is going great so far! If you want to get more about the business model, feel free to email me at gabriele@sinaptik.ai
Francesco Manicardi
Nice product @gabriele_venturi, can you share a bit more about how it works? Does it extract text from the PDF or does it do OCR? If it extracts text strings, how do you deal with tables, which can turn out all messed up with newlines and weird formatting?
Gabriele Venturi
@francesco_manicardi great question. It both extracts text and does OCR. We have built a custom parser that identifies the different components of each page (images, text, tables, charts, etc.), parses each individually with the most accurate technique, and formats the output so it's easier for LLMs to understand.
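The per-component idea described in that reply can be sketched as a dispatch table: segment a page into typed regions, then route each region to the parser best suited for it. This is a minimal illustration; the region types, parser functions, and output format are assumptions, not panda{·}etl's actual parser.

```python
# Sketch of a per-component page parser: each detected region is
# handled by a specialized parser, and the results are joined into
# an LLM-friendly plain-text page. All names here are illustrative.

def parse_table(region):
    # Render table cells as a pipe-delimited row.
    return "| " + " | ".join(region["cells"]) + " |"

def parse_text(region):
    # Plain text regions pass through unchanged.
    return region["text"]

PARSERS = {"table": parse_table, "text": parse_text}

def parse_page(regions):
    """Route each region to its parser and join the results."""
    return "\n".join(PARSERS[r["kind"]](r) for r in regions)

page = parse_page([
    {"kind": "text", "text": "Quarterly report"},
    {"kind": "table", "cells": ["Q1", "Q2", "Q3"]},
])
```

The benefit of this structure is that an OCR-based parser for scanned images can be added as just another entry in the dispatch table, without touching the text or table paths.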
Daniel Bukač
I love the design and the idea is solid. How would you distinguish yourself from Deepnote's AI features?
Gabriele Venturi
@daniel_bukac thanks a lot for the feedback! We are going to focus more and more on a no-code UX, adding more pipelines from the community. The goal is that, no matter your technical expertise, everyone can run pipelines on panda{·}etl!
Kyrylo Silin
Hey Giuseppe, How does panda{·}etl handle different languages or document formats that might have inconsistent layouts? For the on-premise deployments, what kind of setup and maintenance is typically required? Congrats on the launch!
Gabriele Venturi
@kyrylosilin we have built a parser that is able to split each page into one or multiple areas and apply the best technique to parse the data accordingly! As for on-prem, it depends a lot on the specific use case. It can be as easy as a Docker container for simple use cases, while more complex architectures (Terraform, Kubernetes, etc.) might be needed depending on volumes! If you have any questions about it, drop a message anytime!
Jose Quan
Read in one of the comments that you offer an API… I run a copier dealership with hundreds of copier scanners that produce thousands of PDFs; pretty sure our clients (banks, insurance cos, BPOs, car dealerships, hospitals, etc.) would find it useful. How can we get in touch?
Giuseppe Della Corte
@jose_quan1 on our website there is a form where you can book a call
Tony Han
I've tried a few RAG enabled tools, but none of them seem to be effective. Will try this out - looks very promising. I like how it's open source, free (with a limit) and how you can create workflows to automate file processing. Would be cool to see how others are handling files - if they want to share workflows they built! Congrats on the launch @gdc and team!
Gabriele Venturi
@tonyhanded thanks a lot, can’t wait to hear your thoughts 🚀
Giustino
Congrats on the launch @gdc! I'm really looking forward to trying this. Working with several clients in the past years, I can see how much value they could get from more open access to data! 👏
Ahmet Erkan Paşahan
It's a great idea and looks like it'll be very productive. Congrats on the launch!
Gabriele Venturi
@aepasahan thanks a lot, really looking forward to hearing your thoughts!
Daniel Zhang
This is really interesting. I think the summary of the 4 main issues is exactly what data scientists/analysts face. Especially permission settings: it took us a week just to pass out the right credentials and permissions for different database access. I've checked out your pricing and I wasn't too sure how far the 500 credits would get me. Is it in terms of number of files, or total file size? How much would you say that is? Again, congratulations Giuseppe!
Gabriele Venturi
@daniel_xpo the pricing is based on characters or pages, whichever is lower. The free plan includes at least 1,000 pages per month. Thanks a lot for the great feedback. Looking forward to hearing more if you give it a try!
Sawana H
Congrats on launching Panda ETL! 🎉The idea of building ETLs without coding sounds like it could save me hours of work each week. By the way, I'm curious about how it handles really large datasets. Does it have any limitations on data volume or processing speed?
Gabriele Venturi
@sawana_h I swear this is only the beginning, stay tuned, we are planning to disrupt the way people do ETLs! It handles large datasets very well, but at the moment it parallelizes up to 3 processes at a time. We're working hard to scale it soon tho!
Abhay Talreja
@gdc, congrats on the new product, mate. panda{·}etl looks super promising, curious to see how it evolves in the data handling space.
Gabriele Venturi
@abhaytalreja thanks a lot, we’ll keep pushing to make it better and support more and more data pipelines 🚀
Zishan Iqbal
@gdc Congratulations on the launch! Your platform’s ability to turn various file types into actionable data and highlight sources is impressive. At InterWiz, we're revolutionizing the hiring process with on-demand AI-powered interviews and instant evaluations, ensuring top talent is swiftly identified and onboarded. How do you plan to enhance data point definition for more precise AI-powered extraction?