{"id":29451,"date":"2026-04-17T13:29:13","date_gmt":"2026-04-17T17:29:13","guid":{"rendered":"https:\/\/www.crim.ca\/zerospam-les-spams-se-diversifient-au-meme-rythme-que-le-contenu-web\/"},"modified":"2026-04-17T13:29:13","modified_gmt":"2026-04-17T17:29:13","slug":"zerospam-les-spams-se-diversifient-au-meme-rythme-que-le-contenu-web","status":"publish","type":"post","link":"https:\/\/www.crim.ca\/en\/zerospam-les-spams-se-diversifient-au-meme-rythme-que-le-contenu-web\/","title":{"rendered":"Zerospam &#8211; As Web content diversifies, so does spam"},"content":{"rendered":"<p>319 billion. The number of <em>spam messages<\/em> sent and received every day (December 2021). <\/p>\n<p><img fetchpriority=\"high\" decoding=\"async\" class=\"alignright wp-image-20657 size-medium\" src=\"https:\/\/www.crim.ca\/wp-content\/uploads\/2023\/03\/2-300x214.jpg\" alt=\"Person sitting at a desk in front of two screens displaying computer code, with a laptop, notebook and work accessories, in a brick-walled office space.\" width=\"300\" height=\"214\" srcset=\"https:\/\/www.crim.ca\/wp-content\/uploads\/2023\/03\/2-300x214.jpg 300w, https:\/\/www.crim.ca\/wp-content\/uploads\/2023\/03\/2-1024x732.jpg 1024w, https:\/\/www.crim.ca\/wp-content\/uploads\/2023\/03\/2-768x549.jpg 768w, https:\/\/www.crim.ca\/wp-content\/uploads\/2023\/03\/2-1536x1097.jpg 1536w, https:\/\/www.crim.ca\/wp-content\/uploads\/2023\/03\/2.jpg 2000w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/>Among spam types, marketing and advertising e-mails are the most common (36%).<\/p>\n<p>But marketing spam is more than just an irritant &#8211; it&#8217;s a real waste of time, especially for a company. Sorting through <em>spam<\/em> can take each employee 1 to 2 minutes. For a company with 500 employees, <strong>that adds up to as much as 1,000 minutes: counted in 8-hour workdays, spam costs the company more than 2 days every day<\/strong>!
<\/p>\n<p><em>Spam<\/em> reduces productivity, adds resource-consuming traffic to the server, and slows down system performance. Added to this are fraudulent sales messages and phishing, which contribute in large part (91%) to increasingly sophisticated cybercriminal attacks. <\/p>\n<p>&nbsp;<\/p>\n<h3><strong>Spam filters, the nets that spam manages to slip through<\/strong><\/h3>\n<p>It&#8217;s easy to see why companies invest large sums in anti-spam filtering systems, whose sole aim is to separate <em>spam<\/em> (generally unsolicited promotional messages) from <em>ham<\/em> (wanted messages). These systems have to deal with the elements that make up the e-mail envelope: the domain (for example: Amazon.ca), the subject line (the heading that sums up the content of the e-mail), the sender of the e-mail, and the IP address. <\/p>\n<p>&nbsp;<\/p>\n<p>These filters are generally rule-based (for example, an IP address on a blacklist will trigger a rule). They list the characteristic elements of spam and assign a score to the e-mail. The more of these elements the filters detect, the higher the score and the more likely the e-mail is to be classified as spam. <\/p>\n<p>As spam techniques are constantly evolving, keeping the rules up to date is a challenge. In fact, spammers manage to bypass these filters by making their spam look like ham. <\/p>\n<h3><strong>Approaches to differentiating ham from spam<\/strong><\/h3>\n<p>ZEROSPAM (since acquired by Hornetsecurity) offers a complete anti-spam secure messaging solution. The diversity of spam makes its role all the more complex. Spammers always manage to outwit the filters, and the company has to rewrite the rules. But the more rules are added, the greater the risk that they will contradict each other. In short, there&#8217;s always a new wave of spam that filters can&#8217;t detect!
<\/p>\n<p>To address this problem, Zerospam wanted to explore how the meaning of e-mail content could be used for spam detection.<\/p>\n<p>The CRIM approach is based on <strong>language models <\/strong>that enable the machine to represent the meaning of words beyond their written form. This representation is passed to a computer program called a classifier, which learns to differentiate spam e-mails from ham. <\/p>\n<p>The models used by CRIM are multilingual (see box), which means they can recognize words typical of spam across several languages, for example pill$ and medicament$.<\/p>\n<p>As Zerospam does not store the e-mails that pass through its servers, CRIM has ensured that these models can be developed and trained on external corpora without compromising their representativeness.<\/p>\n<p>The CRIM experts selected two language models that they felt showed promise in this context: the LASER method and multilingual Sentence-BERT (see box).<\/p>\n<p>&nbsp;<\/p>\n<h3><strong>The results<\/strong><\/h3>\n<p>Using both approaches (CRIM language models and written form) together, we obtained better results than Zerospam&#8217;s existing system.<\/p>\n<p>Indeed, the approach based on written form alone identified 59% of the spam in a dataset. With our method, which combines word meaning and written form, the rate rises to 76%, a clear improvement in spam detection.
<\/p>\n<p>Zerospam is now equipped with a representation method, a ham\/spam classifier, and a robust methodology and tools for testing other representation methods.<\/p>\n<p>In conclusion, CRIM has successfully demonstrated that pre-trained multilingual language models improve spam detection.<\/p>\n<p>The experts are optimistic: with training on a <strong>larger<\/strong> corpus, closer to the actual flow of e-mails, performance should improve further.<\/p>\n<p>&nbsp;<\/p>\n<h3><strong>Spam from the future<\/strong><\/h3>\n<p>With the rapid evolution of pre-trained language models, performance is expected to keep improving. And the tools provided to Zerospam will make it easy to re-train the models on new spam corpora and stay on top of the ever-changing spam landscape. <\/p>\n","protected":false},"excerpt":{"rendered":"<p>319 billion. The number of spam messages sent and received every day (December 2021). Among spam types, marketing and advertising e-mails are the most common (36%). But marketing spam is more than just an irritant &#8211; it&#8217;s a real waste of time, especially for a company. Sorting through spam can take 1 to 2 minutes. 
[&hellip;]<\/p>\n","protected":false},"author":18,"featured_media":20630,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":"","_links_to":"","_links_to_target":""},"categories":[112],"tags":[83,522,523,524],"class_list":["post-29451","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-success-story","tag-data","tag-offres-personnalisees-en","tag-paiement-en","tag-transactions-en"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.crim.ca\/en\/wp-json\/wp\/v2\/posts\/29451","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.crim.ca\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.crim.ca\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.crim.ca\/en\/wp-json\/wp\/v2\/users\/18"}],"replies":[{"embeddable":true,"href":"https:\/\/www.crim.ca\/en\/wp-json\/wp\/v2\/comments?post=29451"}],"version-history":[{"count":0,"href":"https:\/\/www.crim.ca\/en\/wp-json\/wp\/v2\/posts\/29451\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.crim.ca\/en\/wp-json\/wp\/v2\/media\/20630"}],"wp:attachment":[{"href":"https:\/\/www.crim.ca\/en\/wp-json\/wp\/v2\/media?parent=29451"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.crim.ca\/en\/wp-json\/wp\/v2\/categories?post=29451"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.crim.ca\/en\/wp-json\/wp\/v2\/tags?post=29451"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}