Almost everyone has heard of AI by now. Some experts claim that AI will permanently change tech, work, and society by increasing efficiency and completing in seconds the routine tasks that take hours by hand. While the wider effects of the "AI Revolution" are still theoretical, one has already become reality: productivity gains in workplaces that use AI tools like Microsoft Copilot. In some cases, integrating an AI tool can be a game-changer, but for the tool to be most effective, the data needs to be in the right state. So what is AI-ready data, and how can organizations make sure their data is suitable for AI tools?
In this article, we'll focus on Microsoft Copilot, but many of the principles also apply to other similar tools.
Microsoft Copilot is an AI system that combines large language models (LLMs) with an organization's data. Integrated with the Microsoft 365 apps, Copilot is designed to increase productivity by analyzing content and generating summaries and new material. Microsoft claims it will help users "be more creative in Word, more analytical in Excel, more expressive in PowerPoint, more productive in Outlook and more collaborative in Teams."
Since Copilot launched in 2023, some organizations have reported productivity gains. In a Microsoft feedback survey of 297 users, 70% said Copilot made them more productive. Asked to estimate how much time it saved, participants reported an average of about 14 minutes a day, with about a quarter claiming it saved half an hour or more.
There are multiple versions of Copilot in development, each with a different focus. One is optimized for Sales, another for Development, and another for Security. Each aims to specialize in one area instead of providing general assistance. They will have different permissions and access levels and will handle different tasks, such as summarization, analysis, and task automation.
For example, Copilot for Sales drafts emails using information from a connected CRM platform so that they are specific to the prospect. It can also create pitch decks, Excel spreadsheets, and PowerPoint presentations to increase the Sales team's productivity. At scale, this can streamline work while still providing personalized assistance to customers and leads, with everything connected back to the CRM to maximize accuracy.
Over time, Microsoft plans to release more of these Copilots with new features to make them more useful for organizations. Planned features include voice in mobile apps like OneNote, availability in GCC (Government Community Cloud), and more integrations with Microsoft Viva. Even if your organization handles large quantities of data and doesn't see much use for AI right now, it may in the future. Building a strong data foundation now means you'll be ready to integrate AI when the right use case appears.
Making the most of Copilot depends on having a strong data foundation, meaning proper naming, storage, and access conventions. Many organizations don’t have their data in the best condition, so it’s recommended to do some cleanup before integrating an AI system.
The push for remote work, combined with the sudden emergence of COVID-19, led many organizations to rush into the cloud so employees could work from home. Because this happened so fast, there was often no time to analyze or update the existing architecture. As a result, some best practices for migrating data to the cloud weren’t followed, and the data didn’t end up in an AI-ready state.
If the organization already has a data foundation with proper naming, storage, and access conventions, integrating Copilot will be relatively straightforward. Without that foundation, Copilot will be much less effective. But that's only part of the problem. More significantly, Copilot may expose sensitive information to people who shouldn't have access to it, so a strong data foundation is crucial for both security and efficiency before integrating Copilot.
Limiting the AI's access to specific data is one way to improve your data foundation, and using Groups to avoid oversharing is a good place to start. A Group in Microsoft 365 is essentially a set of permissions, and adding a user to a Group gives them that permission set immediately. Changing something at the Group level applies to all members at once, so Groups are much easier to manage than individual users. Check that your Groups are up to date and add new ones if needed; Groups can also include policies for Copilot.
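To make the Group idea concrete, here is a minimal Python sketch of group-based permissions. It is an illustration only and does not call any Microsoft 365 API; the `Group` class, the `effective_permissions` function, and the permission strings are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    """A named permission set shared by all members (hypothetical model)."""
    name: str
    permissions: set[str] = field(default_factory=set)
    members: set[str] = field(default_factory=set)


def effective_permissions(user: str, groups: list[Group]) -> set[str]:
    """Union of the permissions of every group the user belongs to."""
    granted: set[str] = set()
    for group in groups:
        if user in group.members:
            granted |= group.permissions
    return granted


finance = Group("Finance", {"read:finance-site"}, {"alice"})
everyone = Group("All Staff", {"read:intranet"}, {"alice", "bob"})
groups = [finance, everyone]

# Adding a user to a group grants the whole permission set at once.
finance.members.add("bob")

# Changing the group changes access for every member in one step.
finance.permissions.add("read:budget-library")

bob_permissions = effective_permissions("bob", groups)
```

Because access is computed from membership, an audit only has to review the groups and their members rather than every user individually.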
Another way to limit access is to block Copilot from specific content. In SharePoint, this is most straightforward at the site level: restricting a site means Copilot can't reference or search for information in it.
Limiting Copilot this way requires activating Restricted SharePoint Search. When it is active, Copilot shows a message that some information can't be accessed. An admin also needs to set up an allowed list of curated SharePoint sites. Beyond those sites, Copilot can still search the user's OneDrive, emails, and chats, as well as content from frequently accessed sites, which include pinned sites, visited sites, and sites from which a file has been shared. Keep in mind that enabling this restricts all searches in those sites, not just for Copilot but for users too.
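The allowed-list behavior described above can be modeled in a few lines of Python. This is a simplified sketch, not how Restricted SharePoint Search is implemented or configured: `SearchScope` and `_is_under` are hypothetical names, the `contoso` URLs are placeholders, and the model covers only sites, leaving out OneDrive, emails, and chats.

```python
from dataclasses import dataclass, field


def _is_under(site_url: str, document_url: str) -> bool:
    """True if document_url is the site itself or a path beneath it."""
    site = site_url.rstrip("/").lower()
    doc = document_url.lower()
    return doc == site or doc.startswith(site + "/")


@dataclass
class SearchScope:
    """Hypothetical model of the sites one user's Copilot may search."""
    allowed_sites: set[str]  # curated list maintained by an admin
    frequent_sites: set[str] = field(default_factory=set)  # pinned, visited, shared-from

    def is_searchable(self, document_url: str) -> bool:
        sites = self.allowed_sites | self.frequent_sites
        return any(_is_under(site, document_url) for site in sites)


scope = SearchScope(
    allowed_sites={"https://contoso.sharepoint.com/sites/handbook"},
    frequent_sites={"https://contoso.sharepoint.com/sites/team-a"},
)
in_scope = scope.is_searchable(
    "https://contoso.sharepoint.com/sites/handbook/pto.docx"
)
out_of_scope = scope.is_searchable(
    "https://contoso.sharepoint.com/sites/payroll/salaries.xlsx"
)
```

Matching on the site URL followed by a slash keeps a similarly named site, such as a hypothetical `handbook-drafts`, from slipping in through a plain prefix match.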
Metadata gives context about the data itself, so using it properly makes it easier for the AI model to understand what it's looking at. Different sources identify different purposes for metadata, but they tend to answer four W's and one H (who, what, when, where, and how) so that someone quickly looking for information on the data can find the why.
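As a rough illustration, the four W's and one H can be treated as fields on a record, which makes gaps easy to spot. The sketch below is an assumption-laden example rather than any standard schema: `FileMetadata`, its field meanings, and `missing_fields` are hypothetical.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class FileMetadata:
    """Hypothetical record answering the four W's and one H for a file."""
    who: Optional[str] = None    # owner or author
    what: Optional[str] = None   # short description of the content
    when: Optional[str] = None   # created or last-reviewed date
    where: Optional[str] = None  # storage location
    how: Optional[str] = None    # how the data was produced or collected


def missing_fields(record: FileMetadata) -> list[str]:
    """Names of fields that are unset or blank, in declaration order."""
    return [
        f.name
        for f in fields(record)
        if not (getattr(record, f.name) or "").strip()
    ]


report = FileMetadata(
    who="finance-team",
    what="Q3 budget forecast",
    when="2024-10-01",
    where="",  # location was never recorded
)
gaps = missing_fields(report)
```

Here `gaps` lists `where` and `how`, the two questions this record cannot yet answer.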
An AI like Copilot can offer a lot of advantages, but when working with structured data, you need to define your structure clearly. The AI looks for certain markers to understand the data, and it needs to understand the data well in order to find, summarize, and transform it accurately.
For example, for Copilot for Power BI to work efficiently, it needs to know the relationships between tables, how measures are named and defined, hierarchies, and more. Microsoft suggests cleaning up semantic models before enabling Copilot for better results.
Sources like Gartner identify six different types of metadata, each of which has a different purpose.
All of them are important, but improving technical metadata helps make the data discoverable, while improving governance metadata helps ensure the right people have access.
Following metadata standards for AI readiness will help the data be useful and discoverable, both for humans and AI.
If your data isn't in an AI-ready state, that doesn't mean you can't use an AI system. Still, the best option is to clean up and restructure data before integrating AI so that information is accurate and sensitive data isn't exposed to the wrong people.
So how do you do this? There are a few ways to get your data ready for AI, such as labeling sensitive data, updating access restrictions, archiving, and migrating data.
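Two of these steps, labeling sensitive data and finding files to archive, can be sketched as a simple scan. The example below is a minimal sketch under stated assumptions: the two regular expressions are illustrative and far from a complete classifier, the 730-day threshold is arbitrary, and `FileRecord`, `sensitivity_labels`, and `is_archive_candidate` are hypothetical names.

```python
import re
from dataclasses import dataclass
from datetime import date, timedelta

# Illustrative patterns only; real classification needs far more care.
SENSITIVE_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


@dataclass
class FileRecord:
    path: str
    last_modified: date
    text: str


def sensitivity_labels(record: FileRecord) -> list[str]:
    """Sorted names of every pattern that matches the file's text."""
    return sorted(
        name
        for name, pattern in SENSITIVE_PATTERNS.items()
        if pattern.search(record.text)
    )


def is_archive_candidate(
    record: FileRecord, today: date, max_age_days: int = 730
) -> bool:
    """True if the file was last modified more than max_age_days ago."""
    return today - record.last_modified > timedelta(days=max_age_days)


today = date(2024, 6, 1)
old_payroll = FileRecord("hr/payroll-2019.txt", date(2019, 3, 1), "SSN 123-45-6789")
new_memo = FileRecord("team/memo.txt", date(2024, 5, 20), "Lunch is at noon.")

payroll_labels = sensitivity_labels(old_payroll)
payroll_is_stale = is_archive_candidate(old_payroll, today)
```

A file that is both sensitive and stale, like the payroll example, is a natural first candidate for tighter access restrictions or archiving.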
Of course, to use AI systems at all, your data has to be in a cloud storage platform that supports them, such as SharePoint. If your data is currently in an unsupported cloud platform or an on-prem file server, you can use Movebot to migrate your files to the right place. Movebot is the fastest and simplest data migration tool there is, completely SaaS with no infrastructure setup or complex configuration. Simply create an account, connect your storage, and start moving data in minutes.
New accounts also get 50GB of free data to move, so you can see for yourself how easy it is. And there are no required demos, sales calls, or credit cards. Sign up for an account now to get started.