Tourist Spending in Russia Soars to 1.8 Trillion Rubles in a Year - SberAnalytics

June 17, 2024, 9:42 pm
Total spending by tourists in Russia has grown 1.8-fold over five years, reaching 1.8 trillion rubles in 2023, according to a study prepared by SberAnalytics in June. The analysis used anonymized retrospective data on tourist flows and expenditures.

"People are willing to spend more on food, entertainment, and hotels. About a third of all vacation expenses go to accommodation," the study says. Tourists spent the most in Moscow, the Moscow region, the Krasnodar region, St. Petersburg, the Leningrad region, and Tatarstan. Analysts also found that the largest number of young tourists come from the North Caucasus region.

Domestic tourism in Russia has grown by 32% over the five years since 2019. In the pandemic year of 2020, 92 million citizens vacationed within the country, in 2021 and 2022 - 115 million each, and in 2023 - 153 million, a record over the five years of observation. SberAnalytics expects the positive trend to continue in 2024.

Large Language Models - a Race to a Dead End or a Breakthrough into the Future?
Returning to my favorite topic of large language models (LLMs). Watching the industry over the past few months - its events and dynamics - clearly shows a movement, with increasing acceleration, straight into a dead end. The finish could be spectacular. Where do these conclusions come from? Let's break it down.

Anyone actively using LLMs in their work - especially when that work goes beyond writing texts to more serious analytical tasks or coding - has surely noticed their lack of abstraction and systematic thinking. They constantly get stuck on specifics; debugging code is a good illustration. They handle minor errors well, but if the error is systemic - in the logic of the code or in the data structures - they usually cannot cope. The same goes for analytical tasks: they manage junior-level work, but anything more serious causes difficulties. Let's note this fact and move on.

The biggest drawback of an LLM's neural network, in my humble opinion, is that its structure is static. Unlike the human brain, which is a dynamic structure, an LLM's structure is fixed from the start - the number of layers, their width, the number of input and output parameters - and nothing can be changed, only trained. During training, conditional "images" and concepts form inside the network. Some of them can be matched to known words of the language (which some enthusiasts of LLM anatomy successfully do), while others probably have no analogs, since they represent more complex abstractions.

Let's note two key parameters of a neural network: width and depth.

Depth is the number of layers in the network. It determines how well the model can abstract: at the input the model deals with low-level abstractions - tokens (parts of words, symbols) - while deep inside it already operates on vector representations of complex concepts. Insufficient depth means the model cannot conduct deep, systemic analysis; hence the superficiality we often see in practice and talked about at the very beginning.

Width is the number of neurons in a given layer. It determines how many representations the network can operate with at that level of abstraction. The more of them there are, the more fully they can reflect the real world - and that reflection is essentially what an LLM is. What happens if a layer's width is insufficient? The network cannot fully form the conceptual apparatus of that level of abstraction, which leads to errors, substitution of concepts with similar ones, and ultimately a loss of accuracy or hallucinations. And if the width is excessive? The conceptual apparatus becomes harder to form and blurrier, again at the cost of accuracy. In practice, as I see it, the first case is far more common.

The key problem is that we do not know in advance what the width of each specific layer and the depth of the whole model should be. In a living brain these parameters are dynamic, because they depend on the information received during learning: neurons form and die, connections change. The architecture of the artificial neural networks we use is static, so the only option is to set the width and depth larger, with a margin - and no one guarantees that even this will be enough at some specific N-th layer.
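To make the width/depth trade-off concrete before moving on, here is a back-of-the-envelope sketch of my own (it assumes a standard decoder-only transformer block with a 4x MLP expansion and ignores biases, normalization, and the output head): parameters grow linearly with depth but roughly quadratically with width.

```python
# A rough, hypothetical estimate of how parameter count scales for a standard
# decoder-only transformer: attention projections ~4*d^2 plus a 4x-expanded MLP
# ~8*d^2 give roughly 12*d^2 parameters per layer (biases and norms ignored).

def transformer_params(d_model: int, n_layers: int, vocab_size: int = 50_000) -> int:
    per_layer = 12 * d_model ** 2        # grows quadratically with width
    embeddings = vocab_size * d_model    # token embedding matrix
    return n_layers * per_layer + embeddings   # grows linearly with depth

base = transformer_params(d_model=4096, n_layers=32)
print(f"baseline:  {base / 1e9:5.1f}B parameters")
print(f"2x deeper: {transformer_params(4096, 64) / 1e9:5.1f}B parameters")  # ~2x the parameters
print(f"2x wider:  {transformer_params(8192, 32) / 1e9:5.1f}B parameters")  # ~4x the parameters
```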
But setting everything larger "with a margin" gives rise to a number of problems.

1. If increasing the model's depth grows the number of parameters linearly, increasing the width of a layer grows it roughly quadratically (see the sketch above). That is why we see top LLMs exceeding a trillion parameters, while a comparison with models two orders of magnitude smaller shows no dramatic difference in generation quality. And since the models keep growing at this pace, we watch industry leaders frantically building up computational power, constructing new data centers, and feverishly solving the energy-supply problems of these monsters. Improving a model's quality by a hypothetical 2% requires increasing compute by an order of magnitude.

2. The unrestrained growth in the number of parameters demands a huge amount of training data - and preferably high-quality data. This is a big problem. Already there is talk of generating new training data artificially, because natural sources are running out. Trying to stuff the model with everything at hand creates new problems: a drop in generation quality, biases, and so on.

3. During training, all of the model's weights are recalculated on every iteration, for every input token. This is catastrophically inefficient. Imagine having to reread a book from the beginning for every next word! (Yes, the comparison is imperfect, but it vividly conveys the scale of the problem.) Moreover, at inference time almost all of the model's weights are involved in computing every generated output token. A back-of-the-envelope estimate of what this costs in compute is sketched right after this list.

4. As computational complexity grows, so does the problem of parallelism. The overhead grows non-linearly with model size, and communication between the individual nodes of a cluster introduces delays. New accelerators with more memory and various optimizations partly help, but only partly, because the models themselves are growing much faster.
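Here is the estimate promised above. It relies on the widely quoted rules of thumb of roughly 2N FLOPs per generated token and roughly 6N FLOPs per training token for a dense model with N parameters; these are approximations, and the model and dataset sizes below are hypothetical, chosen only to show the scale.

```python
# Back-of-the-envelope compute costs for a dense model with N parameters,
# using common rough rules of thumb: ~2*N FLOPs per generated token
# (a forward pass touches almost every weight) and ~6*N FLOPs per training
# token (forward plus backward pass). All numbers here are hypothetical.

def inference_flops_per_token(n_params: float) -> float:
    return 2 * n_params

def training_flops(n_params: float, n_tokens: float) -> float:
    return 6 * n_params * n_tokens

N_PARAMS = 1e12        # a hypothetical trillion-parameter model
N_TOKENS = 10e12       # a hypothetical 10-trillion-token training corpus

print(f"per generated token: {inference_flops_per_token(N_PARAMS):.1e} FLOPs")
print(f"full training run:   {training_flops(N_PARAMS, N_TOKENS):.1e} FLOPs")
# ~6e25 FLOPs: at an optimistic sustained 1e15 FLOP/s per accelerator, that is
# about 6e10 device-seconds - roughly ten thousand accelerators busy for ~2 months.
```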
These are just some of the emerging problems, but the most acute ones. And they are quite obvious to the people developing LLMs. So why do they rush headlong into a technological dead end with such persistence, enthusiasm, and ever-increasing acceleration? The answer is simple. LLM technology has undoubtedly shown what it can do, and even at this technological level it is quite capable of producing a system close to, or even surpassing, human abilities. Whoever gets there first will, figuratively speaking, have invented a new atomic bomb - an absolute weapon that gives a new technological impetus, possibly helping to develop a new, more efficient architecture and, as the dead end approaches, to make a quantum leap over this potential barrier. Maybe... Or maybe not. And although the market leaders are full of optimism, we may yet witness another financial catastrophe - a dot-com crash squared - if the hypothetical GPT-5 does not live up to the high expectations while the resources for creating GPT-6 are measured not in billions but in hundreds of billions or trillions of dollars. We were recently struck by Sam Altman voicing similarly astronomical estimates of the resources he wants to attract. And he knows what he is talking about.

But let's get back to reality. We are in Russia, under technological sanctions. Industry leaders Sber and Yandex are trying to build something with their models, but we can see that... well, let's not dwell on the sad part. Is there a way out? There is always a way out, and sometimes more than one.

Perhaps some relevant developments are under way (most likely they are), but fundamental things - new neural network architectures in particular and new artificial intelligence systems in general - are not created quickly. For the market leaders it is a race measured in months; they have no time for new architectures, so they squeeze the maximum out of what they have. We will certainly not catch up with them on that road, so we need to take a different one. Let's set aside exotic technologies like quantum computing - that is a matter of the distant future. Sometimes, to come up with something new, you just need to recall something well forgotten.

For a long time, AI development moved in the direction of deterministic models: expert systems, fuzzy-logic systems, and so on. Against this background stands the technology of semantic networks, where nodes are concepts and connections are the relationships between them (essentially, modern LLMs are semantic networks, only non-deterministic). Add a hierarchical structure to it for abstracting concepts. Make the structure itself dynamic, so that nodes and connections are created during training. Build the model's training and operation on agent technologies: agents move through the network according to given rules and their internal states, making point changes (training) or gathering information to form a response. The agent approach does not require recalculating the entire network and parallelizes well, without demanding colossal computational power.

That's all, thank you to those who read to the end :) As always, I welcome substantive comments, remarks, and ideas!
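P.S. To make the proposal a bit more tangible, here is a toy sketch of the kind of mechanics I have in mind. Everything in it is invented for illustration - the SemanticNetwork and Agent classes, the relation names, and the reinforcement rule are not an existing library or a finished design - just a dynamic graph of concepts plus an agent that walks it and strengthens only the connections it actually uses.

```python
import random
from collections import defaultdict

class SemanticNetwork:
    """A toy dynamic semantic network: nodes are concepts, edges are weighted, typed relations."""
    def __init__(self):
        self.edges = defaultdict(dict)   # node -> {neighbor: (relation, weight)}

    def relate(self, a: str, relation: str, b: str, weight: float = 1.0):
        """Create nodes and a connection on the fly - the structure is not fixed in advance."""
        self.edges[a][b] = (relation, weight)

    def reinforce(self, a: str, b: str, delta: float = 0.1):
        """Local, point-wise 'training': only the traversed edge changes."""
        relation, weight = self.edges[a][b]
        self.edges[a][b] = (relation, weight + delta)

class Agent:
    """An agent walks the network from a starting concept, preferring stronger connections."""
    def __init__(self, net: SemanticNetwork):
        self.net = net

    def answer(self, start: str, steps: int = 3) -> list:
        path, node = [start], start
        for _ in range(steps):
            neighbors = self.net.edges.get(node)
            if not neighbors:
                break
            # pick the next concept with probability proportional to edge weight
            nxt = random.choices(list(neighbors), weights=[w for _, w in neighbors.values()])[0]
            self.net.reinforce(node, nxt)    # the walk itself updates only what it touched
            path.append(nxt)
            node = nxt
        return path

net = SemanticNetwork()
net.relate("cat", "is_a", "animal")
net.relate("animal", "is_a", "living thing")   # a small concept hierarchy
net.relate("cat", "eats", "fish", weight=0.5)
print(Agent(net).answer("cat"))
```

The contrast with an LLM is the whole point of the sketch: nodes and connections appear during training rather than being fixed in advance, and each query or update touches a handful of edges instead of every parameter, which is why many such agents can work on the network in parallel.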