Tech News Directory

SA-EMO: Structure-Aligned Encoder Mixture of Operators for Generalizable Full-waveform Inversion

Research

Arxiv • 14 hours ago

arXiv:2511.11627v1 Announce Type: new Abstract: Full-waveform inversion (FWI) can produce high-resolution subsurface models, yet it remains inherently ill-posed, highly nonlinear, and computationally intensive. Although recent deep learning and numerical acceleration methods have improved speed and

TimeStampEval: A Simple LLM Eval and a Little Fuzzy Matching Trick to Improve Search Accuracy

Research

Arxiv • 14 hours ago

arXiv:2511.11594v1 Announce Type: new Abstract: Traditional fuzzy matching often fails when searching for quotes that are semantically identical but syntactically different across documents, a common issue when aligning official written records with speech-to-text transcripts. We introduce TimeStamp

MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling

Research

Arxiv • 14 hours ago

arXiv:2511.11793v1 Announce Type: new Abstract: We present MiroThinker v1.0, an open-source research agent designed to advance tool-augmented reasoning and information-seeking capabilities. Unlike previous agents that only scale up model size or context length, MiroThinker explores interaction scal

On the Notion that Language Models Reason

Research

Arxiv • 14 hours ago

arXiv:2511.11810v1 Announce Type: new Abstract: Language models (LMs) are said to be exhibiting reasoning, but what does this entail? We assess definitions of reasoning and how key papers in the field of natural language processing (NLP) use the notion and argue that the definitions provided are no

Scaling Open-Weight Large Language Models for Hydropower Regulatory Information Extraction: A Systematic Analysis

Research

Arxiv • 14 hours ago

arXiv:2511.11821v1 Announce Type: new Abstract: Information extraction from regulatory documents using large language models presents critical trade-offs between performance and computational resources. We evaluated seven open-weight models (0.6B-70B parameters) on hydropower licensing documentatio

Towards Autoformalization of LLM-generated Outputs for Requirement Verification

Research

Arxiv • 14 hours ago

arXiv:2511.11829v1 Announce Type: new Abstract: Autoformalization, the process of translating informal statements into formal logic, has gained renewed interest with the emergence of powerful Large Language Models (LLMs). While LLMs show promise in generating structured outputs from natural languag

Three Stage Narrative Analysis; Plot-Sentiment Breakdown, Structure Learning and Concept Detection

Research

Arxiv • 14 hours ago

arXiv:2511.11857v1 Announce Type: new Abstract: Story understanding and analysis have long been challenging areas within Natural Language Understanding. Automated narrative analysis requires deep computational semantic representations along with syntactic processing. Moreover, the large volume of n

Identifying Imaging Follow-Up in Radiology Reports: A Comparative Analysis of Traditional ML and LLM Approaches

Research

Arxiv • 14 hours ago

arXiv:2511.11867v1 Announce Type: new Abstract: Large language models (LLMs) have shown considerable promise in clinical natural language processing, yet few domain-specific datasets exist to rigorously evaluate their performance on radiology tasks. In this work, we introduce an annotated corpus of

MedPT: A Massive Medical Question Answering Dataset for Brazilian-Portuguese Speakers

Research

Arxiv • 14 hours ago

arXiv:2511.11878v1 Announce Type: new Abstract: While large language models (LLMs) show transformative potential in healthcare, their development remains focused on high-resource languages, creating a critical barrier for others as simple translation fails to capture unique clinical and cultural nu

ClinStructor: AI-Powered Structuring of Unstructured Clinical Texts

Research

Arxiv • 14 hours ago

arXiv:2511.11883v1 Announce Type: new Abstract: Clinical notes contain valuable, context-rich information, but their unstructured format introduces several challenges, including unintended biases (e.g., gender or racial bias), and poor generalization across clinical settings (e.g., models trained o

Context-Emotion Aware Therapeutic Dialogue Generation: A Multi-component Reinforcement Learning Approach to Language Models for Mental Health Support

Research

Arxiv • 14 hours ago

arXiv:2511.11884v1 Announce Type: new Abstract: Mental health illness represents a substantial global socioeconomic burden, with COVID-19 further exacerbating accessibility challenges and driving increased demand for telehealth mental health support. While large language models (LLMs) offer promisi

Additive Large Language Models for Semi-Structured Text

Research

Arxiv • 14 hours ago

arXiv:2511.11922v1 Announce Type: new Abstract: Large Language Models have advanced clinical text classification, but their opaque predictions remain a critical barrier to practical adoption in research and clinical settings where investigators and physicians need to understand which parts of a pat

InData: Towards Secure Multi-Step, Tool-Based Data Analysis

Research

Arxiv • 14 hours ago

arXiv:2511.11933v1 Announce Type: new Abstract: Large language model agents for data analysis typically generate and execute code directly on databases. However, when applied to sensitive data, this approach poses significant security risks. To address this issue, we propose a security-motivated al

Improving LLM's Attachment to External Knowledge In Dialogue Generation Tasks Through Entity Anonymization

Research

Arxiv • 14 hours ago

arXiv:2511.11946v1 Announce Type: new Abstract: Knowledge graph-based dialogue generation (KG-DG) is a challenging task requiring models to effectively incorporate external knowledge into conversational responses. While large language models (LLMs) have achieved impressive results across various NL

On the Entropy Calibration of Language Models

Research

Arxiv • 14 hours ago

arXiv:2511.11966v1 Announce Type: new Abstract: We study the problem of entropy calibration, which asks whether a language model's entropy over generations matches its log loss on human text. Past work found that models are miscalibrated, with entropy per step increasing (and text quality decreasin

A Reasoning Paradigm for Named Entity Recognition

Research

Arxiv • 14 hours ago

arXiv:2511.11978v1 Announce Type: new Abstract: Generative LLMs typically improve Named Entity Recognition (NER) performance through instruction tuning. They excel at generating entities by semantic pattern matching but lack an explicit, verifiable reasoning mechanism. This "cognitive shortcutting"

Critical or Compliant? The Double-Edged Sword of Reasoning in Chain-of-Thought Explanations

Research

Arxiv • 14 hours ago

arXiv:2511.12001v1 Announce Type: new Abstract: Explanations are often promoted as tools for transparency, but they can also foster confirmation bias; users may assume reasoning is correct whenever outputs appear acceptable. We study this double-edged role of Chain-of-Thought (CoT) explanations in

CURE: Cultural Understanding and Reasoning Evaluation - A Framework for "Thick" Culture Alignment Evaluation in LLMs

Research

Arxiv • 14 hours ago

arXiv:2511.12014v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly deployed in culturally diverse environments, yet existing evaluations of cultural competence remain limited. Existing methods focus on de-contextualized correctness or forced-choice judgments, overlooking

Exploring Parameter-Efficient Fine-Tuning and Backtranslation for the WMT 25 General Translation Task

Research

Arxiv • 14 hours ago

arXiv:2511.12109v1 Announce Type: new Abstract: In this paper, we explore the effectiveness of combining fine-tuning and backtranslation on a small Japanese corpus for neural machine translation. Starting from a baseline English→Japanese model (COMET = 0.460), we first apply backtra

LLMLagBench: Identifying Temporal Training Boundaries in Large Language Models

Research

Arxiv • 14 hours ago

arXiv:2511.12116v1 Announce Type: new Abstract: Large Language Models (LLMs) are pretrained on textual data up to a specific temporal cutoff. This creates a strict knowledge boundary beyond which models cannot provide accurate information without querying external sources. More subtly, when this li