What We've Learned So Far with Our Journey into AI for Document Management

Our findings from the post-ChatGPT world

by Regan Wolfrom

We often see rocket ship imagery with startups (and we’ve used it a few times ourselves), where the ideal path is up, up, and away, while a just-as-common trajectory is up-down-EXPLODE.

I think of what AI’s imagery should be. It’s not a robot holding a laptop, which is a common result on image searches. Based on what we’ve learned so far at FormKiQ, AI has been like riding a high-speed train high up into the Himalayas.

There are some dips, and plenty of dangerous curves, and eventually AI may climb so high that we may not be able to breathe when we reach the end of the journey.

But that’s not the biggest (or only) takeaway we’ve had since November 30th, 2022, when ChatGPT was released. We’ve learned quite a bit, and while some of it has turned our pre-ChatGPT AI strategy on its head, other parts feel like maybe the train track is still heading in the same direction.

Ownership, Confidentiality, and the Big Data Cow-Catcher

While people are often more impressed than bothered by ChatGPT, for more visual generative AI such as Midjourney, there are obvious concerns about where the model’s training data came from. When a generated image has a distorted version of the Getty Images watermark, it’s not hard to wonder how much of what you’re using is taken from other creators. Hint: basically all of it.

Since Artificial Intelligence is not Artificial General Intelligence, i.e., not an artificial reflection of the human brain, the generated results from any AI model are just a gumbo (or paella) of the data that was used to train it.

Some tech philosophers might claim, echoing what Picasso said, that good models copy and great models steal, and that it’s no different from what humans themselves are doing as they trace over their favorite artworks or write stirring passages of Melrose Place fan fiction. That may be something that comes in the future, when artificial minds actually think, but for now that’s not a compelling argument for a Reddit thread or a court of law.

But beyond the concepts of plagiarism and fair use and credit to creators, there is the issue of security. While specific AI companies may provide assurances about how they handle data submitted to an AI API for the purpose of generating things like summaries and translations, many companies are still reluctant to share proprietary information via an API, and there may also be regulatory and compliance risks when it comes to the personal information of customers and employees.

In other words: does the AI Cowcatcher Catch it All?

This has led to a murky understanding of just which AI services are safe to use with confidential information and which are not, and that means that for most organizations, an overabundance of caution is warranted.

That could mean trying to find models that can be run on personal computers or in on-prem data centers, or it could just mean ensuring that the models are used in cloud accounts under that organization’s control.

For most, it means that just paying for access and sending confidential data over an API request to OpenAI or other AI APIs is not a top choice. Instead, the big cloud providers are providing a walled garden approach, where the model is brought into the cloud account and data that is included in AI prompts never leaves the yard.

For FormKiQ, that means that we are looking at offering the OpenAI API using a bring-your-own-key model, while also working on using both Amazon SageMaker and Amazon Comprehend for more guarded data, using models from providers like Cohere.

One possible workflow would be to use Amazon Comprehend to remove Personally Identifiable Information (PII), at which point data that is not considered proprietary IP could be sent to an external AI API.
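As a rough illustration of the redaction step in that workflow, here is a minimal Python sketch. It assumes the PII entities arrive in the shape returned by Amazon Comprehend's `detect_pii_entities` call (a list of items with `Type`, `BeginOffset`, `EndOffset`, and `Score`). The `redact_pii` helper, the `min_score` threshold, and the sample data are hypothetical, and this is not FormKiQ's actual implementation. It also assumes the detected spans do not overlap.

```python
from typing import Dict, List


def redact_pii(text: str, entities: List[Dict], min_score: float = 0.5) -> str:
    """Replace each detected PII span with a [TYPE] placeholder.

    `entities` is assumed to follow the shape of Amazon Comprehend's
    detect_pii_entities response; spans are assumed not to overlap.
    """
    redacted = text
    # Work from the end of the text so earlier offsets stay valid.
    for ent in sorted(entities, key=lambda e: e["BeginOffset"], reverse=True):
        if ent.get("Score", 1.0) < min_score:
            continue  # skip low-confidence detections (hypothetical policy)
        placeholder = f"[{ent['Type']}]"
        redacted = redacted[: ent["BeginOffset"]] + placeholder + redacted[ent["EndOffset"]:]
    return redacted


# With AWS credentials configured, the entities could be fetched with:
#   import boto3
#   entities = boto3.client("comprehend").detect_pii_entities(
#       Text=text, LanguageCode="en")["Entities"]
# The hand-written entities below stand in for that response.
sample = "Contact Jane Doe at jane@example.com."
sample_entities = [
    {"Type": "NAME", "BeginOffset": 8, "EndOffset": 16, "Score": 0.99},
    {"Type": "EMAIL", "BeginOffset": 20, "EndOffset": 36, "Score": 0.99},
]
print(redact_pii(sample, sample_entities))  # Contact [NAME] at [EMAIL].
```

Only the redacted string would then be passed on to an external AI API; whether the remaining text counts as proprietary is a separate decision that this sketch does not make.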

Another workflow would be to stay completely within the AWS Account, using Amazon SageMaker with a preloaded AI model.

A third workflow would be to refine models and/or train new models, all within a cloud account under the customer’s control. An example of this, i.e., why you’d bother training your own model, would be to create a customized document classification model based on a set variety of documents that are highly specific to the organization’s workflows, where a more generic document classification model would not return results that are granular enough.

In essence, your actual AI workflow will depend on your specific needs, and it’s important for platforms like FormKiQ (and AWS) to provide enough flexibility to meet those needs.
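To make the choice between those three workflows concrete, here is a small Python sketch. The `Workflow` enum, the `choose_workflow` function, and its two flags are hypothetical names used only for illustration; a real decision would likely weigh more factors, such as cost and regulatory requirements.

```python
from enum import Enum


class Workflow(Enum):
    REDACT_THEN_EXTERNAL_API = "remove PII, then call an external AI API"
    IN_ACCOUNT_PRELOADED_MODEL = "use a preloaded model inside the cloud account"
    IN_ACCOUNT_CUSTOM_MODEL = "train or refine a model inside the cloud account"


def choose_workflow(is_proprietary: bool, needs_custom_model: bool) -> Workflow:
    """Pick one of the three workflows described above (simplified rules)."""
    if needs_custom_model:
        # Generic models are not granular enough for this organization's documents.
        return Workflow.IN_ACCOUNT_CUSTOM_MODEL
    if is_proprietary:
        # Proprietary data never leaves the account.
        return Workflow.IN_ACCOUNT_PRELOADED_MODEL
    # Non-proprietary data can go to an external API once PII is removed.
    return Workflow.REDACT_THEN_EXTERNAL_API


print(choose_workflow(is_proprietary=False, needs_custom_model=False).value)
```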

Open Source vs. Commercial AI Models: the Age-Old Public/Private Infrastructure Debate

We recently completed a project with CANARIE, the Canadian Network for the Advancement of Research, Industry, and Education, that looked at document classification built entirely with free and open source components. The end result is a DAIR BoosterPack, a free, curated package of resources on a specific emerging technology, and in our case, it leverages what was the current state of Open Source AI as of Q1 2023.

NOTE: we are presenting a webinar on Wednesday, July 26th, 2023 at 12pm EDT / 9am PDT that will walk through this FormKiQ Automated Document Classification and Discovery BoosterPack. If that sounds interesting, you can register to attend.

What we learned is that at the time, the open source models could not compete with OpenAI. The results were inconsistent in quality, with the same prompt producing wildly different responses of wildly different accuracy.

Here are the models we used:

For Generating Useful File Names: T5 One Line Summary
For Named Entity Recognition: bert-large-NER
For Determining Document Types: Document Image Transformer (base-sized model)

One interesting situation came from using the Document Image Transformer from Microsoft. It was really good at determining the document type, as long as that type was among the specific document types included in its training data (some academic, some business-oriented). Documents from other areas, such as legal documents, were often classified as some form of scientific literature.

What is convenient about OpenAI’s large language models is that they can perform many different kinds of tasks, with a very robust response to prompts. For instance, FormKiQ’s Document Tagging action can not only ask OpenAI to determine specified tags based on document content, it can also specify the return format and even the specific notation used for key names, e.g., we can ask for keys to be named using camel case (“camelCase”) or we can ask for snake case (“snake_case”).
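As a sketch of what asking for a specific return format and key notation could look like, the Python below builds a tagging prompt and then checks that the response keys follow the requested style. The function names, the prompt wording, and the regular expressions are all assumptions made for illustration; this is not FormKiQ's Document Tagging action or OpenAI's API, and the model call itself is left out.

```python
import json
import re
from typing import Dict, List

# Simplified patterns for the two key styles mentioned above.
KEY_PATTERNS = {
    "camelCase": re.compile(r"^[a-z]+(?:[A-Z][a-z0-9]*)*$"),
    "snake_case": re.compile(r"^[a-z]+(?:_[a-z0-9]+)*$"),
}


def build_tagging_prompt(content: str, tags: List[str], key_style: str) -> str:
    """Build a prompt that names the tags, the return format, and the key style."""
    if key_style not in KEY_PATTERNS:
        raise ValueError(f"unsupported key style: {key_style}")
    return (
        f"Extract the following tags from the document: {', '.join(tags)}.\n"
        f"Respond with a single JSON object whose keys are written in {key_style}.\n"
        f"Document:\n{content}"
    )


def parse_tag_response(raw: str, key_style: str) -> Dict[str, str]:
    """Parse the model's JSON reply and reject keys in the wrong notation."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    bad_keys = [key for key in data if not KEY_PATTERNS[key_style].match(key)]
    if bad_keys:
        raise ValueError(f"keys not in {key_style}: {bad_keys}")
    return data


prompt = build_tagging_prompt("Invoice #123 ...", ["document type", "author"], "camelCase")
# A reply like this would be accepted for camelCase and rejected for snake_case.
tags = parse_tag_response('{"documentType": "invoice", "author": "unknown"}', "camelCase")
print(tags["documentType"])
```

Validating the reply is worthwhile because, as noted earlier, the same prompt can produce different responses from one run to the next.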

That was the case in Q1 of 2023. I don’t expect this to be the same finding we’d make if we tried this all over again in Q1 of 2024. As the technology around models and transformers advances and the cost of creating new models declines (which is likely despite the increased demand for GPUs), we do expect that open source models will continually advance to near-parity with commercial models.

I say near-parity, because it’s not yet clear that open source can meet or exceed the models created by OpenAI and Google. It’s definitely possible, but for now, we’re hedging our bets and assuming that both commercial and open source models will be key components for AI strategies for a good while.

Efficiencies of Scale: Most of these Rail Cars are Headed in the Same Direction

Ultimately, while AI strategy will vary across industries, geographies, and business models, the result will likely be a fairly small set of building blocks, with the flexibility to connect those pieces together as needed.

As platforms like FormKiQ develop new components and the API infrastructure to access those components, I believe the end result will be solutions where organizations will have the ability to choose specific AI items from the buffet table.

It will be the platforms themselves that are in charge of tracking new developments and providing their best tooling and counsel to their customers, who can then grab a new plate and sample the latest offerings, knowing that their data will be safeguarded as required.

For more information on how you can leverage these new AI components, contact us or schedule a consultation call.
