Utilizing Generative AI in Sharing Internal Know-How and Know-Who
To reach a broader audience, this article has been translated from Japanese.
You can find the original version here.
This is the article for Day 24 of the Mamezou Developer Site Advent Calendar 2024.
Hello, everyone. I'm Muta, completely captivated by the charm of Cosense (formerly Scrapbox)[1]. Today is Christmas Eve. Before we know it, 2024 is almost over.
In this article, I will introduce how we use generative AI to share internal know-how and know-who within our company.
Utilizing Cosense for Sharing Know-How and Know-Who
It's been five years since we officially adopted Cosense as a tool within our company in July 2019. Cosense has become a tool that all employees use almost every day and has established itself as an internal portal.
Please refer to the following article for the issues before introducing Cosense, how we resolved them using Cosense, and the effects.
Current Issues in Utilizing Cosense
As of December 2024, the total number of pages exceeds 14,000, and new information keeps being added. On the other hand, because there is so much information, searchability[2] and ease of information utilization have become issues.
So, partly to gain hands-on knowledge of generative AI (the OpenAI API), we decided to tackle this issue through dogfooding and see whether we could resolve it.
Mechanism of Information Retrieval Using Generative AI
Roughly speaking, we implemented it as shown in the diagram below.
The flow from information input to retrieval is as follows:
- Information Input
- Information Retrieval (Narrowing Down)
- Information Transfer & Instruction to Generate RAG
- Question Input
- Request
- Response
- Answer Display
Below, I will explain each in detail.
(1) Information Input
Information is entered simply by writing on Cosense as usual. Nothing more is required. As mentioned earlier, we already have a sufficient amount of high-quality information accumulated there.
(2) Information Retrieval (Narrowing Down)
GitHub Actions periodically retrieves the information entered into Cosense in (1) and narrows it down by excluding pages that contain specific links[3].
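The narrowing-down step can be pictured with a small sketch. This is not our actual workflow code: the page shape (a dict with `title` and `lines`), the `EXCLUDED_LINKS` values, and the helper names are all assumptions made for illustration, and the link parsing only covers simple bracket links and hashtags.

```python
import re

# Hypothetical markers: pages linking to any of these are treated as confidential.
EXCLUDED_LINKS = {"client", "project"}

# Matches simple bracket links such as [client] and hashtags such as #project.
LINK_PATTERN = re.compile(r"\[([^\[\]]+)\]|#(\S+)")


def extract_links(lines):
    """Collect the link targets found in the lines of a page (lowercased)."""
    links = set()
    for line in lines:
        for bracket, hashtag in LINK_PATTERN.findall(line):
            links.add((bracket or hashtag).strip().lower())
    return links


def narrow_down(pages, excluded=EXCLUDED_LINKS):
    """Keep only the pages that link to none of the excluded targets."""
    return [p for p in pages if not (extract_links(p["lines"]) & excluded)]


# Made-up sample pages.
pages = [
    {"title": "How to book a meeting room", "lines": ["Use the [calendar] tool."]},
    {"title": "Kickoff notes", "lines": ["Meeting with [client] about scope."]},
    {"title": "Estimate memo", "lines": ["Draft numbers #project"]},
]
public_pages = narrow_down(pages)
print([p["title"] for p in public_pages])  # ['How to book a meeting room']
```

Filtering on links rather than on page titles means a page is excluded as soon as someone tags it, without maintaining a separate deny list.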
(3) Information Transfer & Instruction to Generate RAG
Next, GitHub Actions sends the information retrieved in (2) to the OpenAI API and instructs it to update the vector database[4].
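Before pages are sent, each one has to be turned into a document that can be uploaded. The sketch below shows one possible way to do that, keeping the title and source URL at the top so that answers can point back to the original page. The URL scheme, the `example-project` name, and the text layout are assumptions, and the upload call itself is deliberately left out.

```python
from urllib.parse import quote

BASE_URL = "https://scrapbox.io"  # assumed host for page URLs


def page_url(project, title):
    """Build a page URL, replacing spaces with underscores and escaping the rest."""
    return f"{BASE_URL}/{quote(project)}/{quote(title.replace(' ', '_'), safe='')}"


def page_to_document(project, page):
    """Render one page as plain text with its title and source URL on top."""
    header = [
        f"title: {page['title']}",
        f"source: {page_url(project, page['title'])}",
        "",
    ]
    return "\n".join(header + page["lines"])


doc = page_to_document(
    "example-project",  # hypothetical project name
    {
        "title": "Meeting room booking",
        "lines": ["Use the calendar tool.", "Ask the office team."],
    },
)
print(doc)
```

Embedding the source URL in the text itself is one simple way to let the assistant cite the page it drew from.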
(4) Question Input
Questions are asked through Slack, our internal communication tool. Specifically, we put questions to the Slack App (@mame-kun).
(5) Request
Slack sends a request (question) to the OpenAI API.
(6) Response
The OpenAI API returns a response (answer) to Slack.
(7) Answer Display
The answer is displayed on Slack as a reply in the thread of the question message.
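Steps (4) to (7) on the Slack side can be sketched as follows. This is a minimal illustration rather than our actual app: the event is a trimmed, made-up `app_mention` payload, the helper names are hypothetical, and the call to the OpenAI API is replaced by a placeholder answer. The returned dict uses the `channel`, `thread_ts`, and `text` arguments of Slack's `chat.postMessage`.

```python
import re

MENTION = re.compile(r"<@[A-Z0-9]+>")  # Slack encodes user mentions as <@USERID>


def extract_question(event):
    """Strip bot mentions from the message text to get the bare question."""
    return MENTION.sub("", event["text"]).strip()


def build_reply(event, answer):
    """Build chat.postMessage arguments that reply inside the question's thread."""
    # A message already inside a thread carries thread_ts; a top-level message
    # starts a new thread keyed by its own ts.
    thread_ts = event.get("thread_ts", event["ts"])
    return {"channel": event["channel"], "thread_ts": thread_ts, "text": answer}


event = {  # trimmed, made-up app_mention event
    "type": "app_mention",
    "channel": "C0123456789",
    "ts": "1735000000.000100",
    "text": "<@U0BOT00001> Where do I submit expense reports?",
}
print(extract_question(event))
print(build_reply(event, "(answer from the assistant)"))
```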
Example of Actual Operation
Below is an example of actually asking a question on Slack.
You can see that the answer to the question is displayed as a message within the thread, cleverly combining information retrieved from Cosense and general information. The information source from Cosense is also displayed, and you can directly access the relevant page from this link.
Furthermore, if you ask additional questions on the thread, it will answer while understanding the context of the initial question.
It's like having a conversation with an expert who knows internal affairs well.
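Keeping the context across follow-up questions comes down to reusing the same AI-side conversation for every message in a Slack thread. The sketch below shows the idea with an in-memory mapping. `ThreadRegistry` and the `create_thread` callback are hypothetical names, the callback stands in for a real API call that creates a conversation thread, and a real bot would need persistent storage so the mapping survives restarts.

```python
from typing import Callable, Dict, Tuple


class ThreadRegistry:
    """Maps a Slack thread to the ID of a conversation thread on the AI side."""

    def __init__(self, create_thread: Callable[[], str]) -> None:
        self._create_thread = create_thread
        self._threads: Dict[Tuple[str, str], str] = {}

    def get_or_create(self, channel: str, thread_ts: str) -> str:
        """Return the existing conversation ID for this thread, or create one."""
        key = (channel, thread_ts)
        if key not in self._threads:
            self._threads[key] = self._create_thread()
        return self._threads[key]


# Stand-in for a real API call that creates a conversation thread.
counter = iter(range(1, 1000))
registry = ThreadRegistry(lambda: f"thread_{next(counter)}")

first = registry.get_or_create("C0123456789", "1735000000.000100")
follow_up = registry.get_or_create("C0123456789", "1735000000.000100")
other = registry.get_or_create("C0123456789", "1735000999.000200")
print(first, follow_up, other)  # thread_1 thread_1 thread_2
```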
Design Points
The design points of this initiative are as follows.
- Use of OpenAI's Assistants API
  - Prepare two types of characters with different personalities
    - mame-kun: a character that has grasped Cosense's information and answers questions in a frank manner
    - mameka: a character that answers general questions positively and cheerfully
  - Synchronize the threads on Slack with the conversation threads on OpenAI to provide natural responses in line with the conversation context
- Knowledge Base (Cosense) Access
  - Efficient information retrieval using OpenAI's File Search
    - Hybrid method of semantic search and keyword search
  - Cosense's information is regularly imported and updated
  - Prohibit access from external users as it is internal information
    - Determine based on Slack user information (suppressed if accessed via GitHub)
- Web Search / Browsing
  - Implemented Perplexity search and browsing functions to obtain the latest information and suppress hallucinations
    - Not just simple URL fetching, but actual browser operations using Playwright, supporting JavaScript-based websites (SPA, etc.)
      - See a separate article for details of this technique
    - Perplexity search uses the Perplexity API
    - Use Function Calling and delegate the decision of when to use these functions to the AI assistant
- Multimodal
  - Realize image input by linking Slack's attached images with OpenAI API's Storage service
    - Use OpenAI's Vision
    - Voice input is currently not supported
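To illustrate the Function Calling point, here is a minimal sketch of how search and browsing could be exposed as tools and dispatched once the assistant decides to call one. The tool names, descriptions, and stub handlers are hypothetical. Only the overall shape (a JSON-schema tool definition, with arguments arriving as a JSON string) follows the function calling format.

```python
import json

# Tool definitions in the JSON-schema style used for function calling.
# The names and descriptions are hypothetical.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "web_search",
            "description": "Search the web for recent information.",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "browse",
            "description": "Open a URL in a browser and return the page text.",
            "parameters": {
                "type": "object",
                "properties": {"url": {"type": "string"}},
                "required": ["url"],
            },
        },
    },
]


def web_search(query):
    return f"(search results for {query!r})"  # stub instead of a real search call


def browse(url):
    return f"(rendered text of {url})"  # stub instead of real browser automation


HANDLERS = {"web_search": web_search, "browse": browse}


def dispatch(name, arguments_json):
    """Run the tool the assistant asked for; arguments arrive as a JSON string."""
    handler = HANDLERS.get(name)
    if handler is None:
        return f"unknown tool: {name}"
    return handler(**json.loads(arguments_json))


print(dispatch("web_search", '{"query": "Cosense release notes"}'))
```

The application only declares what each tool does. Whether and when to call one is left to the assistant, which is what keeps the bot code free of hand-written routing rules.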
Effects of This Initiative
As a result of this initiative, the use of generative AI within the company was promoted and our understanding of AI technology deepened. In addition, the following effects were observed.
- Discovery of Unexpected Information (Joy)
  - Through question-and-answer with generative AI, we became able to extract Cosense's information more efficiently. And above all, in the process of extracting that information, we experienced the joy of discovering unexpected information and connections between pieces of information.
- Realizing that the know-how they write is useful to someone
  - As each employee experiences the above effects, they become more actively involved in accumulating internal know-how and know-who (i.e., writing to Cosense), leading to further motivation for sharing know-how and know-who. This is creating a virtuous cycle that connects information input and output.
Conclusion
How was it?
I feel that generative AI is still mainly used for improving work efficiency, such as increasing development productivity. Through this initiative, we confirmed that generative AI can contribute to subtle changes in organizational culture.
I hope this article will provide hints for your future utilization of generative AI.
[1] Scrapbox was renamed to Cosense. (Personally, I was fond of the old name and wished they hadn't changed it.)
[2] Cosense provides convenient search functions such as "QuickSearch," "Related Pages," "2 hop search," and "Full-text Search." However, when the amount of information reaches over 10,000 pages, it becomes difficult to find the desired page.
[3] Confidential information with links such as clients or projects is excluded from the information retrieval targets.
[4] RAG stands for Retrieval-Augmented Generation, a technique in which relevant information is retrieved and passed to a generative model so that the model can base its answer on that information. Note that in OpenAI's documentation, the term RAG is not used; the knowledge base is referred to as a Vector Store, and searches using it are called File Search.