Tf–idf

In information retrieval, tf–idf (term frequency–inverse document frequency; also written TFIDF, TF–IDF, or Tf–idf) is a measure of the importance of a word to a document in a collection or corpus, adjusted for the fact that some words appear more frequently in general. Like the bag-of-words model, it models a document as a multiset of words, without word order. It refines the simple bag-of-words model by allowing the weight of words to depend on the rest of the corpus.

It was often used as a weighting factor in searches in information retrieval, text mining, and user modeling. A survey conducted in 2015 showed that 83% of text-based recommender systems in digital libraries used tf–idf. Variations of the tf–idf weighting scheme were often used by search engines as a central tool in scoring and ranking a document's relevance given a user query.

One of the simplest ranking functions is computed by summing the tf–idf for each query term; many more sophisticated ranking functions are variants of this simple model.

== Motivations ==
Karen Spärck Jones (1972) conceived a statistical interpretation of term-specificity called Inverse Document Frequency (idf), which became a cornerstone of term weighting:

The specificity of a term can be quantified as an inverse function of the number of documents in which it occurs.

For example, consider the df (document frequency) and idf of some words in Shakespeare's 37 plays. "Romeo", "Falstaff", and "salad" appear in very few plays, so seeing one of these words gives a good idea of which play it might be. In contrast, "good" and "sweet" appear in every play and are completely uninformative as to which play it is.

== Definition ==
The tf–idf is the product of two statistics, term frequency and inverse document frequency. There are various ways of determining the exact values of both statistics.
The result is a formula that aims to define the importance of a keyword or phrase within a document or a web page.

=== Term frequency ===
Term frequency, $\mathrm{tf}(t,d)$, is the relative frequency of term $t$ within document $d$,

$$\mathrm{tf}(t,d) = \frac{f_{t,d}}{\sum_{t' \in d} f_{t',d}},$$

where $f_{t,d}$ is the raw count of a term in a document, i.e., the number of times that term $t$ occurs in document $d$. Note that the denominator is simply the total number of terms in document $d$ (counting each occurrence of the same term separately). There are various other ways to define term frequency:

* the raw count itself: $\mathrm{tf}(t,d) = f_{t,d}$;
* Boolean "frequencies": $\mathrm{tf}(t,d) = 1$ if $t$ occurs in $d$ and 0 otherwise;
* logarithmically scaled frequency: $\mathrm{tf}(t,d) = \log(1 + f_{t,d})$;
* augmented frequency, to prevent a bias towards longer documents, e.g. raw frequency divided by the raw frequency of the most frequently occurring term in the document: $\mathrm{tf}(t,d) = 0.5 + 0.5 \cdot \frac{f_{t,d}}{\max\{f_{t',d} : t' \in d\}}$.

=== Inverse document frequency ===
The inverse document frequency is a measure of how much information the word provides, i.e., how common or rare it is across all documents. It is the logarithmically scaled inverse fraction of the documents that contain the word (obtained by dividing the total number of documents by the number of documents containing the term, and then taking the logarithm of that quotient):

$$\mathrm{idf}(t,D) = \log \frac{N}{n_t}$$

with

* $D$: the set of all documents in the corpus;
* $N = |D|$: the total number of documents in the corpus;
* $n_t = |\{d \in D : t \in d\}|$: the number of documents where the term $t$ appears (i.e., $\mathrm{tf}(t,d) \neq 0$).
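The term-frequency variants and the idf formula above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names and the toy corpus are hypothetical, documents are assumed to be pre-tokenised lists of words, and the natural logarithm is used because the text does not fix a base.

```python
import math
from collections import Counter


def term_frequency(term, doc_tokens):
    """Relative frequency: raw count divided by the total number of tokens."""
    counts = Counter(doc_tokens)
    return counts[term] / len(doc_tokens) if doc_tokens else 0.0


def log_scaled_tf(term, doc_tokens):
    """Logarithmically scaled variant: log(1 + raw count)."""
    return math.log(1 + Counter(doc_tokens)[term])


def augmented_tf(term, doc_tokens):
    """Augmented variant: 0.5 + 0.5 * count / count of the most frequent term.

    Assumes a non-empty document.
    """
    counts = Counter(doc_tokens)
    return 0.5 + 0.5 * counts[term] / max(counts.values())


def inverse_document_frequency(term, corpus):
    """idf(t, D) = log(N / n_t); raises ValueError if the term is absent."""
    n_t = sum(1 for doc in corpus if term in doc)
    if n_t == 0:
        raise ValueError(f"term {term!r} does not occur in the corpus")
    return math.log(len(corpus) / n_t)


# Hypothetical toy corpus, already tokenised (not taken from the article).
corpus = [
    ["this", "is", "a", "a", "sample"],
    ["this", "is", "another", "another", "example", "example", "example"],
]

print(term_frequency("a", corpus[0]))                 # 2 of the 5 tokens
print(inverse_document_frequency("this", corpus))     # occurs in every document
print(inverse_document_frequency("example", corpus))  # occurs in 1 of 2 documents
```

A term that occurs in every document, such as "good" in all 37 plays of the Shakespeare example, gets an idf of log(37/37) = 0 under this definition.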
If a term does not occur in the corpus, $n_t = 0$ and the idf formula leads to a division by zero. It is therefore common to adjust the numerator to $1 + N$ and the denominator to $1 + |\{d \in D : t \in d\}|$.

=== Term frequency–inverse document frequency ===
Then tf–idf is calculated as

$$\mathrm{tfidf}(t,d,D) = \mathrm{tf}(t,d) \cdot \mathrm{idf}(t,D)$$

A high weight in tf–idf is reached by a high term frequency (in the given document) and a low document frequency of the term in the whole collection of documents; the weights hence tend to filter out common terms. Since the ratio inside the idf's log function is always greater than or equal to 1, the value of idf (and tf–idf) is greater than or equal to 0. As a term appears in more documents, the ratio inside the logarithm approaches 1, bringing the idf and tf–idf closer to 0.

== Justification of idf ==
Idf was introduced as "term specificity" by Karen Spärck Jones in a 1972 paper. Although it has worked well as a heuristic, its theoretical foundations remained troublesome for at least three decades afterward, with many researchers trying to find information-theoretic justifications for it.

Spärck Jones's own explanation did not propose much theory, aside from a connection to Zipf's law. Attempts have been made to put idf on a probabilistic footing, by estimating the probability that a given document $d$ contains a term $t$ as the relative document frequency,

$$P(t|D) = \frac{|\{d \in D : t \in d\}|}{N},$$

so that we can define idf as

$$\begin{aligned}\mathrm{idf} &= -\log P(t|D) \\ &= \log \frac{1}{P(t|D)} \\ &= \log \frac{N}{|\{d \in D : t \in d\}|}\end{aligned}$$

Namely, the inverse document frequency is the logarithm of the "inverse" relative document frequency.
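The tf–idf product, the smoothed idf, and the probabilistic reading idf = −log P(t|D) can be checked on a small example. The sketch below is illustrative only: the helper names, the `smooth` flag, and the three-document corpus are assumptions made for this example, and natural logarithms are used throughout. The `score` function implements the simple ranking rule of summing tf–idf over the query terms.

```python
import math
from collections import Counter


def tf(term, doc_tokens):
    """Relative term frequency within one tokenised document."""
    return Counter(doc_tokens)[term] / len(doc_tokens)


def idf(term, corpus, smooth=False):
    """log(N / n_t), or log((1 + N) / (1 + n_t)) when smooth=True."""
    n_docs = len(corpus)
    n_t = sum(1 for doc in corpus if term in doc)
    if smooth:
        return math.log((1 + n_docs) / (1 + n_t))
    return math.log(n_docs / n_t)  # ZeroDivisionError if the term is unseen


def tfidf(term, doc_tokens, corpus, smooth=False):
    """Product of term frequency and inverse document frequency."""
    return tf(term, doc_tokens) * idf(term, corpus, smooth)


def score(query_terms, doc_tokens, corpus):
    """Simplest ranking function: sum of tf-idf over the query terms."""
    return sum(tfidf(t, doc_tokens, corpus) for t in query_terms)


# Hypothetical toy corpus (not taken from the article).
corpus = [
    ["the", "cat", "sat"],
    ["the", "dog", "sat", "sat"],
    ["the", "bird", "flew"],
]

# "the" occurs in every document, so its weight vanishes.
print(tfidf("the", corpus[0], corpus))

# idf equals -log of the relative document frequency P(t|D) = n_t / N.
p_sat = 2 / 3
print(idf("sat", corpus), -math.log(p_sat))

# Smoothing gives a finite value even for a term absent from the corpus.
print(idf("fish", corpus, smooth=True))

# Rank documents for the query ["cat", "sat"], best match first.
query = ["cat", "sat"]
ranking = sorted(range(len(corpus)),
                 key=lambda i: score(query, corpus[i], corpus),
                 reverse=True)
print(ranking)
```

Without smoothing, a query term that never occurs in the corpus raises `ZeroDivisionError` here, which is the division-by-zero case described above.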
This probabilistic interpretation in turn takes the same form as that of self-information. However, applying such information-theoretic notions to problems in information retrieval leads to difficulties when trying to define the appropriate event spaces for the required probability distributions: not only documents need to be taken into account, but also queries and terms.

== Link with information theory ==
Both term frequency and inverse document frequency can be formulated in terms of information theory; this helps to explain why their product has a meaning in terms of the joint informational content of a document. A characteristic assumption about the distribution $p(d,t)$ is that:

$$p(d|t) = \frac{1}{|\{d \in D : t \in d\}|}$$

This assumption and its implications, according to Aizawa, "represent the heuristic that tf–idf employs."

The conditional entropy of a "randomly chosen" document in the corpus $D$, conditional on the fact that it contains a specific term $t$ (and assuming that all documents have equal probability of being chosen), is:

$$H(\mathcal{D}|\mathcal{T}=t) = -\sum_{d} p_{d|t} \log p_{d|t} = -\log \frac{1}{|\{d \in D : t \in d\}|} = \log \frac{|\{d \in D : t \in d\}|}{|D|} + \log |D| = -\mathrm{idf}(t) + \log |D|$$

In terms of notation, $\mathcal{D}$ and $\mathcal{T}$ are "random variables" corresponding respectively to drawing a document or a term.
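The identity H(D | T = t) = −idf(t) + log |D| can be verified numerically. The following sketch assumes, as in the text, that p(d|t) is uniform over the documents containing t; the function names and the toy corpus are hypothetical.

```python
import math


def idf(term, corpus):
    """idf(t) = log(|D| / n_t), with n_t the number of documents containing t."""
    n_t = sum(1 for doc in corpus if term in doc)
    return math.log(len(corpus) / n_t)


def conditional_entropy(term, corpus):
    """H(D | T = t) with p(d|t) = 1 / n_t for each document containing t."""
    n_t = sum(1 for doc in corpus if term in doc)
    p = 1.0 / n_t
    # Sum -p log p over the n_t documents that contain the term.
    return -sum(p * math.log(p) for _ in range(n_t))


# Hypothetical toy corpus (not taken from the article).
corpus = [
    ["the", "cat", "sat"],
    ["the", "dog", "sat", "sat"],
    ["the", "bird", "flew"],
]

for term in ("the", "sat", "cat"):
    lhs = conditional_entropy(term, corpus)
    rhs = -idf(term, corpus) + math.log(len(corpus))
    print(term, lhs, rhs)  # the two columns agree up to rounding
```

A rare term such as "cat" pins down the document exactly, so its conditional entropy is zero and its idf reaches the maximum value log |D|.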
The mutual information can be expressed as

$$M(\mathcal{T};\mathcal{D}) = H(\mathcal{D}) - H(\mathcal{D}|\mathcal{T}) = \sum_{t} p_t \cdot (H(\mathcal{D}) - H(\mathcal{D}|\mathcal{T}=t)) = \sum_{t} p_t \cdot \mathrm{idf}(t)$$

The last step is to expand $p_t$, the unconditional probability of drawing a term, with respect to the (random) choice of a document, to obtain:

$$M(\mathcal{T};\mathcal{D}) = \sum_{t,d} p_{t|d} \cdot p_d \cdot \mathrm{idf}(t) = \sum_{t,d} \mathrm{tf}(t,d) \cdot \frac{1}{|D|} \cdot \mathrm{idf}(t) = \frac{1}{|D|} \sum_{t,d} \mathrm{tf}(t,d) \cdot \mathrm{idf}(t).$$

This expression shows that summing the tf–idf of all possible terms and documents recovers the mutual information between documents and terms, taking into account all the specificities of their joint distribution. Each tf–idf hence carries the "bit of information" attached to a term × document pair.

== Link with statistical theory ==
Tf–idf is closely related to the negative logarithmically transformed p-value from a one-tailed formulation of Fisher's exact test when the underlying corpus documents satisfy certain idealized assumptions. More recently, tf–idf variants were shown to arise as components in the test st

MultiValue database

A MultiValue database is a type of NoSQL and multidimensional database. It is typically considered synonymous with PICK, a database originally developed as the Pick operating system. MultiValue databases include commercial products from Rocket Software, Revelation, InterSystems, Northgate Information Solutions, ONgroup, and other companies. These databases differ from a relational database in that they have features that support and encourage the use of attributes which can take a list of values, rather than all attributes being single-valued. They are often categorized with MUMPS within the category of post-relational databases, although the data model actually pre-dates the relational model. Unlike SQL-DBMS tools, most MultiValue databases can be accessed either with or without SQL.

== History ==
Don Nelson designed the MultiValue data model in the early to mid-1960s. Dick Pick, a developer at TRW, worked on the first implementation of this model for the US Army in 1965. Pick considered the software to be in the public domain because it was written for the military; this was but the first dispute regarding MultiValue databases to be addressed by the courts.

Ken Simms wrote DataBASIC, sometimes known as S-BASIC, in the mid-1970s. It was based on Dartmouth BASIC, but had enhanced features for data management. Simms played a lot of Star Trek (a text-based early computer game originally written in Dartmouth BASIC) while developing the language, to ensure that DataBASIC functioned to his satisfaction.

Three of the implementations of MultiValue (PICK version R77, Microdata Reality 3.x, and Prime Information 1.0) were very similar. In spite of attempts to standardize, particularly by International Spectrum and the Spectrum Manufacturers Association, who designed a logo for all to use, there are no standards across MultiValue implementations. Subsequently, these flavors diverged, although with some cross-over.
These streams of MultiValue database development could be classified as one stemming from PICK R83, one from Microdata Reality, and one from Prime Information. Because of the differences, some implementations have provisions for supporting several flavors of the languages. An attempt to document the similarities and differences can be found at the Post-Relational Database Reference (PRDB).

One reasonable hypothesis for this data model lasting 50 years, with new database implementations of the model even in the 21st century, is that it provides inexpensive database solutions.

== Data model example ==
In a MultiValue database system:

* a database or schema is called an "account";
* a table or collection is called a "file";
* a column or field is called a field or an "attribute", which is composed of "multi-value attributes" and "sub-value attributes" to store multiple values in the same attribute;
* a row or document is called a "record" or "item".

Data is stored using two separate files: a "file" to store raw data and a "dictionary" to store the format for displaying the raw data.

For example, assume there is a file (table) called "PERSON". In this file, there is an attribute called "eMailAddress". The eMailAddress field can store a variable number of email address values in a single record. A list of three email addresses, for instance, can be stored and accessed via a single query when accessing the associated record.

Achieving the same (one-to-many) relationship within a traditional relational database system would involve creating an additional table to store the variable number of email addresses associated with a single "PERSON" record. However, modern relational database systems support this multi-value data model too. For example, in PostgreSQL, a column can be an array of any base type.
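The difference between the two designs can be sketched with plain Python data structures. This is only an analogy, not how any MultiValue product stores data: the record ids, names, and `example.com` addresses are invented, and the helper functions are hypothetical.

```python
# Hypothetical PERSON file: each record ("item") is keyed by an id, and the
# eMailAddress attribute holds a list of values. All names and addresses
# below are invented for illustration.
person_file = {
    "1001": {"name": "Ada", "eMailAddress": ["ada@example.com", "ada@example.org"]},
    "1002": {"name": "Bo", "eMailAddress": ["bo@example.com"]},
}


def emails_multivalue(person_id):
    """One record lookup returns every address in the multi-valued attribute."""
    return person_file[person_id]["eMailAddress"]


def normalise(records):
    """Split the file into two single-valued 'tables', as a relational design would."""
    person_rows = [(pid, rec["name"]) for pid, rec in records.items()]
    email_rows = [(pid, addr)
                  for pid, rec in records.items()
                  for addr in rec["eMailAddress"]]
    return person_rows, email_rows


def emails_relational(person_id, email_rows):
    """Equivalent of selecting from the additional table by its foreign key."""
    return [addr for pid, addr in email_rows if pid == person_id]


person_rows, email_rows = normalise(person_file)
print(emails_multivalue("1001"))
print(emails_relational("1001", email_rows))
```

Both calls return the same addresses; the multi-valued form keeps them inside the record, while the normalised form needs a second collection and a key match to reassemble them.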
== MultiValue Basic Language ==
Multivalue Basic (now commonly styled as mvBasic) is a family of programming languages more or less common (and portable) to all the multivalue databases derived from the original Pick Operating System. The variations between implementations are known as flavours.

The language originates from Dartmouth Basic and the earliest implementation of PickBASIC (now D3 FlashBasic). Over time, various customisations and extensions have been added to take advantage of capabilities added to the different flavours while staying mainly in sync.

mvBasic statements and functions are designed to access and take advantage of the multivalue database model while providing the usual capabilities of most modern languages, for example cryptography and communications. mvBasic is typeless and lends itself to structured programming techniques.

Example code is available but limited. Whilst there are commercial applications and tools available, the multivalue database community has not embraced the open source library/package model to the degree seen with other languages.

The typical mvBasic compiler compiles program source to a P-code executable object and runs in an interpreter, with D3 FlashBasic and jBASE being notable exceptions.

== MultiValue Query Language ==
Known as ENGLISH, ACCESS, AQL, UniQuery, Retrieve, CMQL, and by many other names over the years, corresponding to the different MultiValue implementations, the MultiValue query language differs from SQL in several respects. Each query is issued against a single dictionary within the schema, which could be understood as a virtual file or a portal to the database through which to view the data.

 LIST PEOPLE LAST_NAME FIRST_NAME EMAIL_ADDRESSES WITH LAST_NAME LIKE "Van..."

The above statement would list all e-mail addresses for each person whose last name starts with "Van".
A single entry would be output for each person, with multiple lines showing the multiple e-mail addresses (without repeating other data about the person).
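The shape of that output can be imitated in Python. The sketch below is only an emulation of the described behaviour, not the query engine itself: the records are invented, the `list_people` helper is hypothetical, and the prefix match is assumed to be case-sensitive.

```python
# Hypothetical PEOPLE file; the field names follow the query in the text,
# the records themselves are invented for illustration.
people = [
    {"LAST_NAME": "Van Dyke", "FIRST_NAME": "Ann",
     "EMAIL_ADDRESSES": ["ann@example.com", "ann@example.org"]},
    {"LAST_NAME": "Smith", "FIRST_NAME": "Joe",
     "EMAIL_ADDRESSES": ["joe@example.com"]},
    {"LAST_NAME": "Vance", "FIRST_NAME": "Lee",
     "EMAIL_ADDRESSES": ["lee@example.com"]},
]


def list_people(records, prefix):
    """Return one block of rows per matching person.

    The first row of a block carries the name; further rows carry only the
    additional e-mail addresses, so the name is not repeated.
    """
    rows = []
    for rec in records:
        if not rec["LAST_NAME"].startswith(prefix):
            continue
        emails = rec["EMAIL_ADDRESSES"] or [""]
        for i, email in enumerate(emails):
            if i == 0:
                rows.append((rec["LAST_NAME"], rec["FIRST_NAME"], email))
            else:
                rows.append(("", "", email))
    return rows


for last, first, email in list_people(people, "Van"):
    print(f"{last:<12}{first:<10}{email}")
```

In a relational join, by contrast, the name columns would normally be repeated on every row unless the client formats them away.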

Bioelectronics

Bioelectronics is a field of research in the convergence of biology and electronics.

== Definitions ==
At the first C.E.C. Workshop, in Brussels in November 1991, bioelectronics was defined as 'the use of biological materials and biological architectures for information processing systems and new devices'. Bioelectronics, specifically bio-molecular electronics, was described as 'the research and development of bio-inspired (i.e. self-assembly) inorganic and organic materials and of bio-inspired (i.e. massive parallelism) hardware architectures for the implementation of new information processing systems, sensors and actuators, and for molecular manufacturing down to the atomic scale'. The National Institute of Standards and Technology (NIST), an agency of the United States Department of Commerce, defined bioelectronics in a 2009 report as "the discipline resulting from the convergence of biology and electronics".

Sources of information about the field include the Institute of Electrical and Electronics Engineers (IEEE) and the Elsevier journal Biosensors and Bioelectronics, published since 1990. The journal describes the scope of bioelectronics as seeking to: "... exploit biology in conjunction with electronics in a wider context encompassing, for example, biological fuel cells, bionics and biomaterials for information processing, information storage, electronic components and actuators. A key aspect is the interface between biological materials and micro and nano-electronics."

== History ==
The first known study of bioelectronics took place in the 18th century, when Italian physician-scientist Luigi Galvani applied a voltage to a pair of detached frog legs. The legs moved, sparking the genesis of bioelectronics. Electronics technology has been applied to biology and medicine since the invention of the pacemaker and the rise of the medical imaging industry.
In 2009, a survey of publications using the term in title or abstract suggested that the center of activity was in Europe (43 percent), followed by Asia (23 percent) and the United States (20 percent).

== Materials ==
Organic bioelectronics is the application of organic electronic materials to the field of bioelectronics. Organic (i.e. carbon-containing) materials show great promise when it comes to interfacing with biological systems. Current applications focus on neuroscience and infection.

Conducting polymer coatings, an organic electronic material, show massive improvement in the technology of materials. They were the most sophisticated form of electrical stimulation. They improved the impedance of electrodes in electrical stimulation, resulting in better recordings and reducing "harmful electrochemical side reactions." Organic electrochemical transistors (OECTs), which had the ability to transport ions, were invented in 1984 by Mark Wrighton and colleagues. This improved the signal-to-noise ratio and gave a low measured impedance. The Organic Electronic Ion Pump (OEIP), a device that could be used to target specific body parts and organs to deliver medicine, was created by Magnus Berggren.

As one of the few materials well established in CMOS technology, titanium nitride (TiN) proved exceptionally stable and well suited for electrode applications in medical implants.

== Significant applications ==
Bioelectronics is used to help improve the lives of people with disabilities and diseases. For example, the glucose monitor is a portable device that allows diabetic patients to control and measure their blood sugar levels. Electrical stimulation is used to treat patients with epilepsy, chronic pain, Parkinson's, deafness, essential tremor and blindness. Magnus Berggren and colleagues created a variation of his OEIP, the first bioelectronic implant device that was used in a living, free animal for therapeutic reasons.
It transmitted electric currents into GABA, an acid. A lack of GABA in the body is a factor in chronic pain. GABA would then be dispersed properly to the damaged nerves, acting as a painkiller. Vagus Nerve Stimulation (VNS) is used to activate the Cholinergic Anti-inflammatory Pathway (CAP) in the vagus nerve, resulting in reduced inflammation in patients with diseases like arthritis. Since patients with depression and epilepsy are more vulnerable to having a closed CAP, VNS can aid them as well. At the same time, not all systems that use electronics to help improve people's lives are bioelectronic devices; only those that involve an intimate and direct interface between electronics and biological systems qualify.

Bioelectronics could be used to develop new label-free methods for monitoring cancer cell invasion and drug resistance. For example, the electrical resistance of cancer cells could be used to predict the effectiveness of cancer drugs and to identify drugs that are most likely to be effective against a particular type of cancer.

=== Human tissue regeneration ===
Human tissue, like most tissue in multicellular life, is known to be capable of regeneration. While tissue such as skin and even large organs such as the liver have shown significant capacity for regeneration, much of the adult body is thought to possess limited natural regenerative ability. Research in the field of regenerative medicine has identified that developmental bioelectricity can be used to stimulate and modify tissue growth beyond what naturally occurs, with efforts to demonstrate its feasibility in mammals underway. Some researchers believe that future advancements could allow for the regeneration of organs or even entire limbs using bioelectronic devices that provide the correct signals.

== Future ==
The improvement of standards and tools to monitor the state of cells at subcellular resolutions lacks funding and personnel.
This is a problem because advances in other fields of science are beginning to analyze large cell populations, increasing the need for a device that can monitor cells at such a resolution. Cells cannot yet be used in many ways other than their main purpose, such as for detecting harmful substances. Merging this science with forms of nanotechnology could result in highly accurate detection methods. Preserving human lives, for example by protecting against bioterrorism, is the biggest area of work in bioelectronics. Governments are starting to demand devices and materials that detect chemical and biological threats. As the size of the devices decreases, their performance and capabilities are expected to increase.

Honeywell JetWave

Honeywell's JetWave is a piece of satellite communications hardware produced by Honeywell that enables global in-flight internet connectivity. Its connectivity is provided using Inmarsat’s GX Aviation network. The JetWave platform is used in business and general aviation, as well as by defense and commercial airline users.

== History ==
In 2012, Honeywell announced it would provide Inmarsat with the hardware for its GX Ka-band in-flight connectivity network. The Ka-band (pronounced either "kay-ay band" or "ka band") is a portion of the microwave part of the electromagnetic spectrum defined as frequencies in the range 27.5 to 31 gigahertz (GHz). In satellite communications, the Ka-band allows higher-bandwidth communication.

In 2017, after five years and more than 180 hours of flight testing, JetWave was launched as part of GX Aviation with Lufthansa Group. Honeywell’s JetWave was the exclusive terminal hardware option for the Inmarsat GX Aviation network; however, the exclusivity clause in that contract has expired.

In July 2019, the United States Air Force selected Honeywell’s JetWave satcom system for 70 of its C-17 Globemaster III cargo planes. In December 2019, it was reported that six AirAsia aircraft had been fitted with Inmarsat’s GX Aviation Ka-band connectivity system and that the system was slated to be implemented fleetwide across AirAsia’s Airbus A320 and A330 models in 2020, requiring installation of JetWave hardware atop the aircraft fuselages.

Today, Honeywell’s JetWave hardware is installed on over 1,000 aircraft worldwide. In August 2021, the Civil Aviation Administration of China approved a validation of Honeywell’s MCS-8420 JetWave satellite connectivity system for Airbus A320 aircraft.

In December 2021, Honeywell, SES, and Hughes Network Systems demonstrated multi-orbit high-speed airborne connectivity for military customers using Honeywell’s JetWave MCX terminal with a Hughes HM-series modem, and SES satellites in both medium Earth orbit (MEO) and geostationary orbit (GEO).
The tests achieved full-duplex data rates of more than 40 megabits per second via a number of SES's GEO satellites, including GovSat-1, and the high-throughput, low-latency O3b MEO satellite constellation, with connections moving between GEO and MEO links in under 30 seconds.

== Uses ==
=== Commercial aviation ===
Honeywell’s JetWave enables air transport and regional aircraft to connect to Inmarsat’s GX Aviation network. The multichannel satellite (MCS) JetWave terminals share the same antenna controller, modem and router hardware with the business market, but have an MCS-8200 fuselage-mounted antenna.

=== Business aviation ===
Honeywell’s JetWave hardware allows users to connect to Inmarsat’s Jet ConneX, a business aviation broadband connectivity offering that provides Wi-Fi for connected devices. JetWave offers a tail-mount antenna for business jets.

=== Defense ===
Honeywell’s JetWave satellite communications system for defense allows users to connect to the Inmarsat GX network, offering global coverage for military airborne operators, including over water, over nontraditional flight paths and in remote areas. JetWave and the Inmarsat GX network enable mission-critical applications like real-time weather; videoconferencing; large file transfers; encryption capabilities; in-flight briefings; intelligence, surveillance, and reconnaissance video; and secure communications. JetWave is configurable for a variety of military platforms and offers antennas for large and small airframes.

Hoopla (digital media service)

Hoopla Digital is a web and mobile streaming platform launched in 2013 that provides access to a wide range of digital media, including audiobooks, eBooks, comics, manga, music, movies, and TV shows. The service is available to users through participating public libraries, allowing library cardholders to borrow and stream digital media. Hoopla is a division of Midwest Tape.

== History ==
Hoopla was launched in 2013. Its goal was for libraries to provide patrons with access to digital content such as audiobooks, music, movies, and TV shows, without the need for holds or waiting lists. Hoopla's model is a pay-per-use system, which means patrons can borrow items instantly. Since its inception, the service has expanded its offerings to include eBooks and comics. The app was built exclusively for public libraries and their patrons. Hoopla Digital is the only platform that combines all formats and all license models into one convenient app with no platform fees.

In 2017, Hoopla became available on Apple TV, Amazon Fire TV, Android TV, and Roku, allowing users to stream content on larger screens.

In 2020, the Hoopla Flex and Bonus Borrows programs were introduced, enabling libraries to move their one copy/one user titles. At that time, there were 6.5 million library card holders and 2,700+ library partners.

In 2021, the BingePass was introduced, offering patrons seven days to access entire collections with just one borrow.

In 2022, Apple CarPlay and Android Auto support became available, giving users safe and easy access while driving.

In 2023, manga joined Hoopla's comic collection, adding 1.5 million titles to Hoopla's offerings.

In January 2025, Hoopla introduced a new streaming feature called SeasonPass. Building on the existing BingePass model, SeasonPass allows users to borrow an entire season of a television series with a single borrow.

== Business model ==
Hoopla is free of charge for patrons of participating libraries.
The content is paid for by library systems, using a "per circulation transaction model".

== Content ==
Hoopla claims to have over 500,000 content titles across six formats, including over 25,000 comic books. As of November 2016, Hoopla's content comprised 35% audiobooks (for which Hoopla has contracts with publishers such as Blackstone Audio, HarperCollins, Simon & Schuster Audio, Tantor Audio, and others), followed by 22% movies (for which Hoopla has motion picture contracts with publishers such as Disney, Lionsgate, Starz, Warner Bros., and others), 19% music, 12% ebooks, 6% comics, and 6% television. One drawback is that Hoopla has few new bestsellers.

In February 2025, 404 Media reported that Hoopla's collection includes books created by generative AI with fictional authors and dubious quality. Often not labeled as AI-produced or fact-checked, this AI slop can cost libraries money when checked out by unsuspecting patrons. Libraries such as the Sacramento Public Library have questioned the sustainability of Hoopla's pay-per-use model and have considered transitioning to other digital platforms.

=== Areas served ===
Hoopla expanded to serve Australia and New Zealand in June 2021.

== Technology ==
Hoopla content can be borrowed and consumed on the web, or via the native Android or iOS apps. Hoopla streams only in standard definition, unlike most of its competitors, such as Kanopy.

== Parent company ==
John Eldred and Jeff Jankowski founded Hoopla's parent company, Midwest Tape, in 1989. Midwest Tape is a library vendor of physical media such as audiobooks, CDs, and DVD/Blu-ray.

== Controversy ==
Hoopla and Midwest Tape were censured by the Library Freedom Project and Library Futures in a joint statement for hosting what the statement described as "fascist propaganda", including a recent English translation of A New Nobility of Blood and Soil by Richard Walther Darré of the SS and books related to Holocaust denial, in public library collections without input from library staff.
Criticism was also directed at the inclusion of books on homosexuality, abortion, and vaccines claimed by the Library Freedom Project and Library Futures to be misinformation. On February 17, 2022, Hoopla removed a number of titles after public outcry about Holocaust denial books available on the app under non-fiction. The advocacy groups expressed appreciation for the response but stated that it was "insufficient," as they maintained concerns about the company's practices in selecting materials and its lack of transparency.

Text-to-video model

A text-to-video model is a form of generative artificial intelligence that uses a natural language description as input to produce a video relevant to the input text. Advancements during the 2020s in the generation of high-quality, text-conditioned videos have largely been driven by the development of video diffusion models.

== Models ==
Several text-to-video models exist, including open-source ones. CogVideo, which takes Chinese-language input, is the earliest text-to-video model "of 9.4 billion parameters" to be developed; a demo version of its open-source code was first presented on GitHub in 2022. That year, Meta Platforms released a partial text-to-video model called "Make-A-Video", and Google Brain (later Google DeepMind) introduced Imagen Video, a text-to-video model with a 3D U-Net.

=== 2023 ===
In February 2023, Runway released Gen-1, followed later that year by Gen-2; the two were among the first commercially available text-to-video and video-to-video models accessible to the public through a web interface. Gen-1, initially released as a video-to-video model, allowed users to transform existing video footage using text or image prompts. Gen-2, introduced in March 2023 and made publicly available in June 2023, added text-to-video capabilities, enabling users to generate videos from text prompts alone.

In March 2023, a research paper titled "VideoFusion: Decomposed Diffusion Models for High-Quality Video Generation" was published, presenting a novel approach to video generation. The VideoFusion model decomposes the diffusion process into two components: a base noise, which is shared across frames to ensure temporal coherence, and a residual noise. By utilizing a pre-trained image diffusion model as a base generator, the model efficiently generated high-quality and coherent videos. Fine-tuning the pre-trained model on video data addressed the domain gap between image and video data, enhancing the model's ability to produce realistic and consistent video sequences.
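The idea of splitting per-frame noise into a shared part and a frame-specific part can be illustrated with a small sketch. It is not the VideoFusion implementation: it assumes that the base component is the one shared across frames while the residual is drawn independently per frame, and the mixing weight `lam`, the square-root weighting (chosen so that each frame's noise keeps unit variance), and all function names are assumptions made for this example.

```python
import math
import random


def decomposed_noise(num_frames, dim, lam, rng):
    """Build per-frame noise as sqrt(lam) * base + sqrt(1 - lam) * residual_i.

    `base` is drawn once and reused for every frame; each frame gets its own
    residual. `lam` in [0, 1] is a hypothetical mixing weight.
    """
    base = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    frames = []
    for _ in range(num_frames):
        residual = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        frames.append([math.sqrt(lam) * b + math.sqrt(1.0 - lam) * r
                       for b, r in zip(base, residual)])
    return base, frames


def correlation(xs, ys):
    """Sample Pearson correlation between two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
base, frames = decomposed_noise(num_frames=4, dim=20000, lam=0.8, rng=rng)

# Noise in different frames is strongly correlated because of the shared base.
print(correlation(frames[0], frames[1]))
```

Under these assumptions the correlation between the noise of any two frames is approximately `lam`: a weight of 1 makes all frames share identical noise, while a weight of 0 makes them independent.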
Also in March 2023, Adobe introduced Firefly AI as part of its features.

=== 2024 ===
In January 2024, Google announced development of a text-to-video model named Lumiere, which is anticipated to integrate advanced video editing capabilities. Matthias Niessner and Lourdes Agapito at AI company Synthesia work on developing 3D neural rendering techniques that can synthesise realistic video by using 2D and 3D neural representations of shape, appearances, and motion for controllable video synthesis of avatars. In June 2024, Luma Labs launched its Dream Machine video tool. That same month, Kuaishou extended its Kling AI text-to-video model to international users. In July 2024, TikTok owner ByteDance released Jimeng AI in China, through its subsidiary, Faceu Technology. By September 2024, the Chinese AI company MiniMax had debuted its video-01 model, joining other established AI model companies like Zhipu AI, Baichuan, and Moonshot AI, which contribute to China's involvement in AI technology. In December 2024, Lightricks launched LTX Video as an open-source model.

=== 2025 ===
Alternative approaches to text-to-video models include Google's Phenaki, Hour One, Colossyan, Runway's Gen-3 Alpha, and OpenAI's Sora. Several additional text-to-video models, such as Plug-and-Play, Text2LIVE, and TuneAVideo, have emerged. FLUX.1 developer Black Forest Labs has announced its text-to-video model SOTA. Google was preparing to launch a video generation tool named Veo for YouTube Shorts in 2025. In May 2025, Google launched the Veo 3 iteration of the model. It was noted for its impressive audio generation capabilities, which were a previous limitation for text-to-video models. In July 2025, Lightricks released an update to LTX Video capable of generating clips reaching 60 seconds, and in October 2025 it released LTX-2, with audio capabilities built in.
=== 2026 === In February 2026, ByteDance released Seedance 2.0. It was noted for its realistic generation, motion and camera control, and 15-second clips; however, the model faced strong criticism from the Motion Picture Association for copyright infringement. After viewing a viral clip of a fight between actors Brad Pitt and Tom Cruise, Rhett Reese, the co-writer of Deadpool & Wolverine and Zombieland, wrote on social media, "I hate to say it. It’s likely over for us," further stating that "In next to no time, one person is going to be able to sit at a computer and create a movie indistinguishable from what Hollywood now releases." == Architecture and training == Several architectures have been used to create text-to-video models. Similar to text-to-image models, these models can be trained using recurrent neural networks (RNNs) such as long short-term memory (LSTM) networks, which have been used for Pixel Transformation Models and Stochastic Video Generation Models, which aid in consistency and realism respectively. Transformer models are an alternative to these. Generative adversarial networks (GANs), variational autoencoders (VAEs), which can aid in the prediction of human motion, and diffusion models have also been used to develop the image generation aspects of the model. Text-video datasets used to train models include, but are not limited to, WebVid-10M, HDVILA-100M, CCV, ActivityNet, and Panda-70M. These datasets contain millions of original videos of interest, generated videos, captioned videos, and textual information that help train models for accuracy. Text prompt datasets used to train models include, but are not limited to, PromptSource, DiffusionDB, and VidProM. These datasets provide the range of text inputs needed to teach models how to interpret a variety of textual prompts. 
The video generation process involves synchronizing the text inputs with video frames, ensuring alignment and consistency throughout the sequence. Because of resource limitations, the quality of this predictive process tends to decline as the length of the video increases. The Will Smith Eating Spaghetti test is an informal benchmark for models. == Limitations == Despite the rapid improvement in the performance of text-to-video models, a primary limitation is that they are very computationally heavy, which limits their capacity to produce high-quality and lengthy outputs. Additionally, these models require a large amount of specific training data to generate high-quality and coherent outputs, which raises the issue of accessibility. Moreover, models may misinterpret textual prompts, resulting in video outputs that deviate from the intended meaning. This can occur due to limitations in capturing the semantic context embedded in text, which affects the model's ability to align generated video with the user's intended message. Various models, including Make-A-Video, Imagen Video, Phenaki, CogVideo, GODIVA, and NUWA, are currently being tested and refined to enhance their alignment capabilities and overall performance in text-to-video generation. Another issue with the outputs is that text or fine details in AI-generated videos often appear garbled, a problem that Stable Diffusion models also struggle with. Examples include distorted hands and unreadable text. == Ethics == The deployment of text-to-video models raises ethical considerations related to content generation. These models have the potential to create inappropriate or unauthorized content, including explicit material, graphic violence, misinformation, and likenesses of real individuals without consent. Ensuring that AI-generated content complies with established standards for safe and ethical usage is essential, as content generated by these models may not always be easily identified as harmful or misleading. 
The ability of AI to recognize and filter out NSFW or copyrighted content remains an ongoing challenge, with implications for both creators and audiences. == Impacts and applications == Text-to-video models offer a broad range of applications that may benefit various fields, from education and promotion to the creative industries. These models can streamline content creation for training videos, movie previews, gaming assets, and visualizations. During the Russo-Ukrainian war, fake videos made with artificial intelligence were created as part of a propaganda war against Ukraine and shared on social media. These included depictions of children in the Ukrainian Armed Forces, fake ads targeting children and encouraging them to denounce critics of the Ukrainian government, and fictitious statements by Ukrainian President Volodymyr Zelenskyy about the country's surrender, among others. === Movies === Kaur vs Kore, set for release in 2026, is the first Indian feature film made using generative AI; it features a dual role for the AI character of Sunny Leone. Chiranjeevi Hanuman – The Eternal is an Indian movie made entirely using generative AI, created by Vijay Subramaniam and set for theatrical release in 2026. The movie was widely criticised by the Film makers in the Bollywood industr

Quality of experience

Quality of experience (QoE) is a measure of the delight or annoyance of a customer's experiences with a service (e.g., web browsing, phone call, TV broadcast). QoE focuses on the entire service experience; it is a holistic concept, similar to the field of user experience, but with its roots in telecommunication. QoE is an emerging multidisciplinary field based on social psychology, cognitive science, economics, and engineering science, focused on understanding overall human quality requirements. == Definition and concepts == In 2013, within the context of the COST Action QUALINET, QoE was defined as: "The degree of delight or annoyance of the user of an application or service. It results from the fulfillment of his or her expectations with respect to the utility and / or enjoyment of the application or service in the light of the user’s personality and current state." This definition was adopted in 2016 by the International Telecommunication Union in Recommendation ITU-T P.10/G.100. Previously, various definitions of QoE had existed in the domain; the above definition has since found wide acceptance in the community. QoE has historically emerged from Quality of Service (QoS), which attempts to objectively measure service parameters (such as packet loss rates or average throughput). QoS measurement is usually not related to a customer, but to the media or network itself. QoE, however, is a purely subjective measure, from the user's perspective, of the overall quality of the service provided, capturing people's aesthetic and hedonic needs. QoE looks at a vendor's or purveyor's offering from the standpoint of the customer or end user, and asks, "What mix of goods, services, and support, do you think will provide you with the perception that the total product is providing you with the experience you desired and/or expected?" It then asks, "Is this what the vendor/purveyor has actually provided?" 
If not, "What changes need to be made to enhance your total experience?" In short, QoE provides an assessment of human expectations, feelings, perceptions, cognition and satisfaction with respect to a particular product, service or application. QoE is a blueprint of all human subjective and objective quality needs and experiences arising from the interaction of a person with technology and with business entities in a particular context. Although QoE is perceived as subjective, it is an important measure that counts for customers of a service. Being able to measure it in a controlled manner helps operators understand what may be wrong with their services and how to improve them. == QoE factors == QoE aims at taking into consideration every factor that contributes to a user's perceived quality of a system or service. This includes system, human and contextual factors. The following so-called "influence factors" have been identified and classified by Reiter et al. Human influence factors: low-level processing (visual and auditory acuity, gender, age, mood, …); higher-level processing (cognitive processes, socio-cultural and economic background, expectations, needs and goals, other personality traits, …). System influence factors: content-related; media-related (encoding, resolution, sample rate, …); network-related (bandwidth, delay, jitter, …); device-related (screen resolution, display size, …). Context influence factors: physical context (location and space); temporal context (time of day, frequency of use, …); social context (inter-personal relations during experience); economic context; task context (multitasking, interruptions, task type); technical and information context (relationship between systems). Studies in the field of QoE have typically focused on system factors, primarily due to the field's origin in the QoS and network engineering domains. Through the use of dedicated test laboratories, researchers often seek to keep the context constant. 
== QoE versus User Experience == QoE is strongly related to but different from the field of User Experience (UX), which also focuses on users' experiences with services. Historically, QoE has emerged from telecommunication research, while UX has its roots in Human–Computer Interaction. Both fields can be considered multi-disciplinary. In contrast to UX, the goal of improving QoE for users was more strongly motivated by economic needs. Wechsung and De Moor identify the following key differences between the fields: == QoE measurement == As a measure of the end-to-end performance at the service level from the user's perspective, QoE is an important metric for the design of systems and engineering processes. This is particularly relevant for video services because, due to their high traffic demands, bad network performance may strongly affect the user's experience. So, when designing systems, the expected output, i.e. the expected QoE, is often taken into account, also as a system output metric and optimization goal. To measure this level of QoE, human ratings can be used. The mean opinion score (MOS) is a widely used measure for assessing the quality of media signals. It is a limited form of QoE measurement, relating to a specific media type, in a controlled environment and without explicitly taking into account user expectations. The MOS as an indicator of experienced quality has been used for audio and speech communication, as well as for the assessment of quality of Internet video, television and other multimedia signals, and web browsing. Due to inherent limitations in measuring QoE in a single scalar value, the usefulness of the MOS is often debated. Subjective quality evaluation requires a lot of human resources, making it a time-consuming process. Objective evaluation methods can provide quality results faster, but require dedicated computing resources. 
Since such instrumental video quality algorithms are often developed based on a limited set of subjective data, their QoE prediction accuracy may be low when compared to human ratings. QoE metrics are often measured at the end devices and can conceptually be seen as the remaining quality after the distortion introduced during the preparation of the content and the delivery through the network, until it reaches the decoder at the end device. There are several elements in the media preparation and delivery chain, and some of them may introduce distortion. This causes degradation of the content, and several elements in this chain can be considered as "QoE-relevant" for the offered services. The causes of degradation apply to any multimedia service; that is, they are not exclusive to video or speech. Typical degradations occur at the encoding system (compression degradation), transport network, access network (e.g., packet loss or packet delay), home network (e.g., WiFi performance) and end device (e.g., decoding performance). == QoE management == Several QoE-centric network management and bandwidth management solutions have been proposed, which aim to improve the QoE delivered to the end-users. When managing a network, QoE fairness may be taken into account in order to keep the users sufficiently satisfied (i.e., high QoE) in a fair manner. From a QoE perspective, network resources and multimedia services should be managed in order to guarantee specific QoE levels instead of classical QoS parameters, which are unable to reflect the actual delivered QoE. Purely QoE-centric management is challenged by the nature of the Internet itself, as the Internet protocols and architecture were not originally designed to support today's complex and highly demanding multimedia services. As an example of an implementation of QoE management, network nodes can become QoE-aware by estimating the status of the multimedia service as perceived by the end-users. 
This information can then be used to improve the delivery of the multimedia service over the network and proactively improve the users' QoE. This can be achieved, for example, via traffic shaping. QoE management gives the service provider and network operator the capability to minimize storage and network resources by allocating only the resources that are sufficient to maintain a specific level of user satisfaction. As it may involve limiting resources for some users or services in order to increase the overall network performance and QoE, the practice of QoE management requires that net neutrality regulations are considered.
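Two of the quantities discussed above, the mean opinion score and QoE fairness, can be illustrated with a minimal sketch. It assumes ratings on a five-point scale from 1 (bad) to 5 (excellent), uses a normal-approximation 95% confidence interval for the MOS, and uses one fairness index of the form F = 1 − 2σ/(H − L) that has been proposed in the literature, where σ is the standard deviation of the users' QoE values and L and H are the bounds of the rating scale. The function names and sample ratings are hypothetical.

```python
import math
import statistics


def mean_opinion_score(ratings):
    """Return (MOS, half-width of an approximate 95% confidence interval).

    The interval uses the normal approximation 1.96 * s / sqrt(n), which
    is only a rough guide for small panels of raters.
    """
    n = len(ratings)
    mos = statistics.fmean(ratings)
    if n < 2:
        return mos, 0.0
    half_width = 1.96 * statistics.stdev(ratings) / math.sqrt(n)
    return mos, half_width


def qoe_fairness(scores, low=1.0, high=5.0):
    """Fairness index F = 1 - 2 * sigma / (high - low).

    sigma is the population standard deviation of per-user QoE scores.
    F is 1 when all users experience the same QoE and 0 when the scores
    are split evenly between the two ends of the scale.
    """
    sigma = statistics.pstdev(scores)
    return 1.0 - 2.0 * sigma / (high - low)


if __name__ == "__main__":
    mos, ci = mean_opinion_score([4, 5, 3, 4, 4])  # hypothetical ratings
    print(f"MOS = {mos:.2f} +/- {ci:.2f}")
    print(f"fairness = {qoe_fairness([4.2, 3.9, 4.1, 2.0]):.2f}")
```

The fairness index depends only on the spread of the scores, not on their level, so it is meant to be read alongside the average QoE rather than instead of it.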