Hello. Demand for computation involving quantum computing and generative AI has recently been growing nationwide. However, it is often unclear what setup these workloads require, and building the necessary networks and machines takes some knowledge and experience.
While it is important to have computational resources available anywhere in the country, I do not recommend placing such machines in residential or office spaces, because the noise and heat can be quite significant. Today's compute machines often rely on GPUs, each consuming around 300 W of power, so running several of them is akin to having heaters on all year round. Moreover, as performance increases, heat density also increases, which calls for larger cooling mechanisms that in turn cause noise and other issues.
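To make the heater comparison concrete, here is a minimal sketch of the arithmetic. It assumes that essentially all electrical power drawn ends up as heat in the room, and the machine count, GPUs per machine, and the `overhead_w_per_machine` parameter are hypothetical values chosen for illustration; only the 300 W per GPU figure comes from the text.

```python
GPU_POWER_W = 300  # approximate per-GPU draw quoted in the text


def heat_load_watts(machines, gpus_per_machine, overhead_w_per_machine=0.0):
    """Approximate steady-state heat output in watts.

    Assumption: all electrical power drawn is eventually released as heat.
    overhead_w_per_machine (hypothetical) covers CPU, fans, PSU losses, etc.
    """
    per_machine = gpus_per_machine * GPU_POWER_W + overhead_w_per_machine
    return machines * per_machine


if __name__ == "__main__":
    # Hypothetical setup: 2 machines with 4 GPUs each, GPU power only.
    load = heat_load_watts(machines=2, gpus_per_machine=4)
    print(f"{load / 1000:.1f} kW of continuous heat")  # prints: 2.4 kW of continuous heat
```

Even this small hypothetical setup releases heat on the order of a couple of household electric heaters running continuously, which is why a separate, properly cooled space is attractive.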
When such equipment cannot be placed in living or working spaces, servers are often shoved into spare areas such as warehouses, where they gather dust and generate significant heat. Modern computing nodes emit far more heat than older network machines, and without proper cooling they can fail. In today's highly insulated offices and homes, it is extremely difficult to create a suitable environment.
To address this issue more easily, we have developed independent server rooms for computing nodes using structures such as containers, prefabricated buildings, and sheds. Placing them raises various issues, but these can be resolved as long as power and network connections are available. For power, you will need to bring in electricity and ensure a stable supply. The network is also important, but the requirements are somewhat different for generative AI and quantum computing.
For typical networks and web servers, responding to external requests is crucial. Websites change pages frequently, and each change requires processing on the server, so communication matters a great deal. (Although client-side technologies like SPAs are advancing, the frequency of API calls is still high.)
In generative AI and quantum computing, depending on the use case, computation is the main focus, and a single calculation typically takes anywhere from 10 seconds to several hours. As a result, requests are infrequent: it is enough to handle about two requests per job, one to submit the computation and one to receive the results, over a span of several tens of seconds or more. There are challenges, of course, since AI and quantum computing can involve substantial communication for data uploads and downloads. However, if only inference is performed, without training, the communication volume is significantly reduced.
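The two-request pattern described above can be sketched as follows. This is a toy, in-process illustration rather than our actual service: `ComputeNode`, `submit`, `fetch`, and `slow_square` are hypothetical names, and a thread pool stands in for the long-running inference or quantum computation.

```python
import time
import uuid
from concurrent.futures import ThreadPoolExecutor


class ComputeNode:
    """Toy stand-in for a compute server (hypothetical, for illustration only)."""

    def __init__(self):
        self._pool = ThreadPoolExecutor(max_workers=1)
        self._jobs = {}  # job_id -> Future

    def submit(self, payload, work):
        """Request 1: hand over the input and get a job ID back immediately."""
        job_id = uuid.uuid4().hex
        self._jobs[job_id] = self._pool.submit(work, payload)
        return job_id

    def fetch(self, job_id, timeout=None):
        """Request 2: collect the result, waiting up to `timeout` seconds."""
        return self._jobs.pop(job_id).result(timeout=timeout)


def slow_square(x):
    """Placeholder for a computation that would really take seconds to hours."""
    time.sleep(0.1)
    return x * x


if __name__ == "__main__":
    node = ComputeNode()
    job_id = node.submit(7, slow_square)  # small upload, returns at once
    print(node.fetch(job_id, timeout=5))  # small download once the work is done
```

Between the two calls the link carries no traffic for the job, which is why a connection with modest bandwidth and latency can be adequate for this kind of workload.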
In generative AI, the primary operations are receiving text and returning text, or receiving images and text and returning images. Both the data volume and the request frequency are relatively low, while the processing itself typically takes several tens of seconds. The same applies to quantum computing, where quantum circuits or network graphs are received and the computation results are returned. The frequency of computation is quite low, and the data volume is also minimal.
For this deployment, we operated these generative AI and quantum computing servers over 5G cellular networks. In the future, we plan to conduct trials using a Starlink satellite connection as a backup. As long as equipment compatible with cellular networks, such as a router, is available, this setup can be achieved anywhere within cellular coverage across Japan.
We obtained a global IP address and made the servers reachable using DDNS (dynamic DNS). We are currently able to provide stable services, which has allowed more flexible use of computational resources within the company and a successful transition away from the cloud. Moving forward, we plan to make the entire system redundant and move towards distributed management. We also aim to promote the use of computational resources over satellite links using Starlink.
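For readers unfamiliar with DDNS, the core idea is to compare the address the node currently has with the address registered under its hostname, and to update the record only when they differ. The sketch below is a minimal illustration under stated assumptions: `sync_ddns` and its three callables are hypothetical, no real DDNS provider API is used, and the addresses are from the documentation range 203.0.113.0/24.

```python
def sync_ddns(get_public_ip, get_registered_ip, push_update):
    """Update the DDNS record only if the public IP has changed.

    All three arguments are caller-supplied callables (hypothetical hooks):
      get_public_ip()      -> current public IP as a string
      get_registered_ip()  -> IP currently registered for the hostname
      push_update(ip)      -> asks the DDNS provider to register `ip`
    Returns True if an update was pushed, False otherwise.
    """
    current = get_public_ip()
    if current == get_registered_ip():
        return False  # record is already correct, nothing to send
    push_update(current)
    return True


if __name__ == "__main__":
    # In-memory stand-in for the provider's record.
    record = {"ip": "203.0.113.10"}
    changed = sync_ddns(
        get_public_ip=lambda: "203.0.113.25",
        get_registered_ip=lambda: record["ip"],
        push_update=lambda ip: record.update(ip=ip),
    )
    print(changed, record["ip"])  # prints: True 203.0.113.25
```

In practice a check like this would run periodically, and the three hooks would be replaced with whatever the chosen DDNS provider and router actually offer.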