Learning Library

NLP

9 items in this topic

Paper

Learnable Multipliers for Adaptive Scale in LLM Matrix Layers

  • Attaching a learnable scalar multiplier to each weight matrix lets the model escape the suboptimal weight‑norm equilibrium imposed by fixed weight decay.
  • Extending this idea to per‑row and per‑column multipliers further frees individual dimension scales, yielding a more expressive variant of μP‑style scaling.
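The multiplier idea above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the variable names and the per-row/per-column broadcasting scheme are assumptions.

```python
import numpy as np

# Illustrative sketch: decoupling a weight matrix's scale from its
# direction with learnable multipliers (names are assumptions).
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3))   # weight matrix (out_dim x in_dim)

# Per-matrix scalar multiplier: weight decay shrinks W, but the
# learnable g can restore overall scale, escaping the fixed-norm
# equilibrium imposed by decay alone.
g_scalar = 2.0
W_scalar = g_scalar * W

# Per-row and per-column multipliers: each output/input dimension
# gets its own learnable scale (a more expressive muP-style variant).
g_row = np.array([1.0, 0.5, 2.0, 1.0]).reshape(-1, 1)   # (out_dim, 1)
g_col = np.array([1.0, 3.0, 0.25]).reshape(1, -1)       # (1, in_dim)
W_rowcol = g_row * W * g_col

assert W_scalar.shape == W.shape == W_rowcol.shape
```

In training, `g_scalar`, `g_row`, and `g_col` would be optimized jointly with `W`, typically excluded from weight decay so they are free to set the scale.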
Paper

Token‑Level Collaborative Decoding for Efficient LLM Reasoning

  • RelayLLM lets a small language model act as a controller, emitting a special command token to summon the large model only for critical tokens, reducing LLM usage to ~1 % of generated tokens.
  • A two‑stage training regimen (warm‑up plus Group Relative Policy Optimization) teaches the SLM when to generate autonomously and when to request help, balancing independence with strategic assistance.
Paper

Agent-as-a-Judge: Structured LLM Evaluation Framework

  • Pure LLM judges often mis‑evaluate complex, multi‑step outputs because they lack explicit reasoning and verification mechanisms.
  • The paper introduces a modular “agent‑as‑judge” system that first plans an evaluation strategy, then invokes external tools (e.g., calculators, code runners) to verify intermediate claims.
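The plan-then-verify pattern can be sketched with one tool, a safe calculator, checking arithmetic claims. The claim format and the `judge` aggregation below are illustrative assumptions, not the paper's framework.

```python
# Sketch of an agent-as-judge loop: verify intermediate claims with
# external tools rather than trusting a single LLM verdict.
import ast
import operator as op

OPS = {ast.Add: op.add, ast.Sub: op.sub, ast.Mult: op.mul, ast.Div: op.truediv}

def calc(expr):
    """Tiny safe calculator tool for arithmetic claims."""
    def ev(node):
        if isinstance(node, ast.BinOp):
            return OPS[type(node.op)](ev(node.left), ev(node.right))
        if isinstance(node, ast.Constant):
            return node.value
        raise ValueError("unsupported expression")
    return ev(ast.parse(expr, mode="eval").body)

def judge(claims):
    # "Plan": check every (expression, claimed value) pair with the tool,
    # then aggregate the per-step results into a structured verdict.
    report = [(expr, abs(calc(expr) - val) < 1e-9) for expr, val in claims]
    return {"verified": all(ok for _, ok in report), "steps": report}

verdict = judge([("2*3+4", 10), ("10/4", 2.5)])
```

A full agent-as-judge system would also plan *which* claims to extract and route non-arithmetic claims to other tools (e.g., a code runner).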
Paper

Generalized Referring Expressions for Multi‑Target Vision‑Language Tasks

  • Introduces GREx, a unified benchmark that expands traditional referring expression tasks (RES, REC, REG) to support single‑target, multi‑target, and no‑target expressions, enabling more realistic and flexible language‑vision interactions.
  • Releases gRefCOCO, the first large‑scale dataset containing annotated images with all three expression types, while remaining backward‑compatible with existing RES/REC datasets for fair comparison.
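What "supporting all three expression types" means for evaluation can be sketched with a toy matcher. The matching rule below (IoU ≥ 0.5, and a no-target expression is correct only if the model abstains) is an assumption for illustration, not gRefCOCO's official protocol.

```python
# Toy matcher for generalized referring expressions: ground truth may
# contain zero, one, or many target boxes (x1, y1, x2, y2).
def iou(a, b):
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union else 0.0

def grex_match(pred_boxes, gt_boxes, thr=0.5):
    if not gt_boxes:                       # no-target expression
        return not pred_boxes              # correct only if model abstains
    # multi-target: every GT box must be covered by some prediction
    return all(any(iou(p, g) >= thr for p in pred_boxes) for g in gt_boxes)

ok_none = grex_match([], [])                                   # no-target
ok_one = grex_match([(0, 0, 10, 10)], [(0, 0, 10, 10)])        # single-target
```

Classical RES/REC evaluation is the single-target special case, which is why the benchmark can stay backward-compatible.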
Paper

Topological Reasoning via Holonomic Neural Networks

  • Traditional Transformers and RNNs reside in a “Metric Phase” where causal order can be broken by semantic noise, causing hallucinations.
  • By formulating inference as a Symmetry‑Protected Topological (SPT) phase, logical operations become analogous to non‑Abelian anyon braiding, giving them immunity to local perturbations.
Paper

Mamba: Fast Linear‑Time Sequence Modeling with Input‑Conditioned State Spaces

  • Making SSM parameters input‑dependent gives the model content‑based gating, allowing selective propagation or forgetting of information and closing the performance gap with attention on discrete modalities.
  • A hardware‑aware parallel recurrence algorithm restores efficiency lost by dropping convolutions, delivering true linear‑time computation with constant‑factor speedups on modern GPUs/TPUs.
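The core selection mechanism can be sketched as a scalar recurrence whose gates depend on the input. The sigmoid parameterization and scalar state below are illustrative simplifications, not Mamba's exact discretized form (and the real model replaces this sequential loop with a hardware-aware parallel scan).

```python
import numpy as np

# Sketch of a selective (input-conditioned) SSM recurrence: the decay
# a_t and write gate b_t depend on x_t, so the state can selectively
# retain or forget content, unlike a fixed linear time-invariant SSM.
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def selective_scan(x, w_a, w_b):
    h, hs = 0.0, []
    for x_t in x:
        a_t = sigmoid(w_a * x_t)      # input-dependent forgetting
        b_t = sigmoid(w_b * x_t)      # input-dependent write gate
        h = a_t * h + b_t * x_t       # linear-time recurrence
        hs.append(h)
    return np.array(hs)

ys = selective_scan(np.array([1.0, -1.0, 2.0]), w_a=1.0, w_b=1.0)
```

Making `a_t` and `b_t` functions of the input is exactly what breaks the convolutional shortcut of earlier SSMs, which is why the paper needs the parallel recurrence algorithm to stay efficient.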
Paper

Spectral Attention Diagnostics Reveal Valid Mathematical Reasoning

  • Treating attention matrices as token‑level graphs lets spectral analysis separate sound from unsound mathematical proofs.
  • Four graph‑spectral metrics (Fiedler value, high‑frequency energy ratio, smoothness, spectral entropy) achieve large effect sizes (Cohen’s d up to 3.30) across seven models from four families, without any training or fine‑tuning.
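Two of these metrics can be computed directly from an attention matrix once it is treated as a weighted token graph. The symmetrization step, self-loop removal, and toy matrix below are illustrative assumptions about the pipeline.

```python
import numpy as np

# Sketch: attention matrix -> weighted graph -> spectral diagnostics.
A = np.array([[0.8, 0.1, 0.1],
              [0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8]])

W = 0.5 * (A + A.T)                  # symmetrize attention weights
np.fill_diagonal(W, 0.0)             # drop self-loops
L = np.diag(W.sum(axis=1)) - W       # unnormalized graph Laplacian
eigs = np.linalg.eigvalsh(L)

# Fiedler value: second-smallest eigenvalue, the graph's algebraic
# connectivity (how strongly tokens are linked by attention).
fiedler = eigs[1]

# Spectral entropy over the nonzero eigenvalue distribution.
p = eigs[eigs > 1e-12]
p = p / p.sum()
spectral_entropy = -(p * np.log(p)).sum()
```

For this symmetric toy matrix the Laplacian spectrum is {0, 0.3, 0.3}, so `fiedler` is 0.3 and the entropy is ln 2; real attention matrices would yield head- and layer-specific spectra.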
Paper

Hypergraph‑Based Memory for Enhanced Multi‑Step RAG

  • Conventional RAG memories act as static fact repositories, neglecting the higher‑order relations needed for deep reasoning.
  • HGMem models the working memory as a hypergraph where each hyperedge groups related facts, enabling progressive construction of complex relational structures.
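The hyperedge-as-fact-group idea can be sketched with a minimal data structure. The class name, API, and hop-based traversal below are illustrative assumptions, not HGMem's design.

```python
from collections import defaultdict

# Sketch of a hypergraph working memory: each hyperedge groups entities
# that co-occur in a fact, and multi-step queries traverse hyperedges
# through shared entities.
class HypergraphMemory:
    def __init__(self):
        self.edges = []                        # each edge: a set of entities
        self.by_entity = defaultdict(set)      # entity -> indices of edges

    def add_fact_group(self, entities):
        idx = len(self.edges)
        self.edges.append(set(entities))
        for e in entities:
            self.by_entity[e].add(idx)

    def hop(self, entities):
        """One reasoning step: expand to all entities co-grouped with inputs."""
        reached = set(entities)
        for e in entities:
            for idx in self.by_entity[e]:
                reached |= self.edges[idx]
        return reached

mem = HypergraphMemory()
mem.add_fact_group({"Curie", "radium", "chemistry"})
mem.add_fact_group({"radium", "radioactivity"})
two_hop = mem.hop(mem.hop({"Curie"}))
```

A pairwise (ordinary-graph) memory would need an edge per fact pair; the hyperedge keeps each fact's full context together, which is what enables progressively building richer relational structure.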
Paper

Hierarchical Language Modeling with Dynamic Concept Compression

  • DLCM learns variable‑length “concepts” on the fly, moving computation from dense token streams to a compact latent space where reasoning is cheaper and more focused.
  • A new compression‑aware scaling law separates token‑level capacity, concept‑level reasoning capacity, and compression ratio, allowing principled FLOP allocation across the hierarchy.
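How a compression ratio trades token-level against concept-level compute can be sketched with a toy FLOP model. The cost function below is an assumption for illustration, not the paper's scaling law.

```python
# Toy FLOP accounting: with compression ratio r, the concept stream is
# n/r tokens long, so concept-layer compute is amortized by r.
def flops(n_tokens, d_token, d_concept, ratio, layers_tok, layers_con):
    tok = layers_tok * n_tokens * d_token ** 2             # dense token stream
    con = layers_con * (n_tokens / ratio) * d_concept ** 2 # compact latents
    return tok + con

base = flops(4096, 512, 512, ratio=1, layers_tok=4, layers_con=8)
compressed = flops(4096, 512, 512, ratio=4, layers_tok=4, layers_con=8)
```

With ratio 4, concept-level compute drops 4x while token-level compute is unchanged, so the same budget can buy more concept layers; a compression-aware scaling law makes that reallocation principled rather than ad hoc.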