Systematic Reviews
Copyright ©The Author(s) 2025.
World J Orthop. Nov 18, 2025; 16(11): 110276
Published online Nov 18, 2025. doi: 10.5312/wjo.v16.i11.110276
Table 1 Summary of included studies
| Ref. | Country | Study design | Procedure type | Sample size (RA/comparator) | Robotic platform | Comparator type |
| --- | --- | --- | --- | --- | --- | --- |
| Wang et al[1], 2023 | China | Prospective | MIS-TLIF | 61/62 | TiRobot | Freehand fluoroscopy |
| Tong et al[2], 2024 | China | Retrospective | OLIF | 16/22 | Third-gen Mazor X navigation robot | Fluoroscopy |
| Shafi et al[3], 2022 | United States | Retrospective | MIS-TLIF | 92/130 | ExcelsiusGPS | Intraoperative navigation |
| Griepp et al[4], 2024 | United States | Retrospective | MIS-TLIF | 50/133 | Mazor X Stealth robot | Fluoroscopy-assisted |
| Lin et al[5], 2022 | Taiwan | Retrospective | MIS-TLIF | 75/149 | ROSA | Freehand fluoroscopy |
| Li et al[6], 2025 | China | Retrospective | LLIF | 31/28 | TianJi Robot | Traditional LLIF |
| Heath et al[7], 2024 | Taiwan | Retrospective | MIS-TLIF | 42/58 | ROSA ONE | O-arm navigation |
| Chang et al[11], 2022 | China | Prospective | MIS-TLIF | 26/32 | TiRobot | MIS-TLIF |
| Lai et al[15], 2022 | Taiwan | Retrospective | TLIF | 29/79 | Renaissance | Freehand fluoroscopy |
| Zhang et al[20], 2019 | China | Prospective | TLIF | 43/44 | TiRobot | FG |
| Zhang et al[21], 2019 | China | Prospective | TLIF | 50/50 | Robot assisted | FG |
| Chen et al[22], 2021 | China | Retrospective | MIS-TLIF | 52/52 | TiRobot | Freehand open TLIF |
| Cui et al[23], 2021 | China | Retrospective | MIS-TLIF | 23/25 | TiRobot | Open TLIF surgery |
| Feng et al[24], 2020 | China | RCT | OLIF | 40/40 | TiRobot | Open freehand fluoroscopy |
| Li et al[25], 2024 | China | Retrospective | MIS-TLIF | 58/56 | Tianji Robot | Freehand fluoroscopy |
| Han et al[26], 2021 | China | Retrospective | OLIF vs TLIF | 28/33 | TiRobot | MIS-TLIF |
| De Biase et al[27], 2021 | United States | Retrospective | MIS-TLIF | 52/49 | Mazor X | FG |
| Fayed et al[28], 2020 | United States | Retrospective | MIS-TLIF | 103/90 | ExcelsiusGPS | FG |
| Yang et al[29], 2019 | China | Retrospective | MIS-TLIF | 30/30 | Robot-assisted surgical system | FG |
| Schatlo et al[30], 2014 | Germany | Retrospective | TLIF | 55/40 | Mazor | FG |
| Li et al[31], 2024 | China | Retrospective | RA MIS-TLIF (group A) vs RA UBE-T/PLIF (group B) vs traditional MIS-TLIF (group C) | A: 27, B: 30, C: 26 | Tianji Robot (3rd Gen) | Traditional MIS-TLIF |
| Li et al[32], 2022 | China | Retrospective | RA MIS-TLIF | 33/39 | Tianji Robot | Traditional MIS-TLIF |
Table 2 Combined accuracy and perioperative outcomes of included studies
| Ref. | Classification | RA accuracy (grade A) | Comparator accuracy (grade A) | Statistical significance (grade A) | Clinically acceptable screws (A + B) | Statistical significance (A + B) | Operative time (RA vs comparator) | EBL (mL) (RA vs comparator) | Radiation exposure (RA vs comparator) | Notes |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Wang et al[1], 2023 | Gertzbein-Robbins | 85.4% | 69.5% | P < 0.001 | 97.1% vs 95.0% | NS | 160.25 ± 12.13 vs 154.35 ± 15.00 (P = 0.018) | 78.85 ± 33.52 vs 82.90 ± 20.91 (NS) | Surgeon: 13.28 ± 3.09 vs 94.87 ± 6.02 (P < 0.001); patient: NSD (NS) | Less disc height loss at adjacent segments (P < 0.001) |
| Tong et al[2], 2024 | Gertzbein-Robbins | 95.4% | 85.5% | P < 0.05 | 98.4% vs 95.5% | P = 0.04 | 158.8 minutes vs 129.9 minutes (P < 0.05) | 89.8 vs 117.3 (P < 0.05) | 13.3 vs 48.5 fluoroscopy counts (P < 0.05) | Longer operative time in RA but reduced blood loss and radiation exposure. Higher screw accuracy (98.4% vs 95.5%) |
| Shafi et al[3], 2022 | Gertzbein-Robbins | 88.5% | 88.4% | NS | 97.4% vs 95.3% | NS | Not reported | Not reported | Surgeon: Significantly lower with RN (P < 0.001) | Fewer high-grade breaches (0% RN vs 1.2% ION, P = 0.05) |
| Griepp et al[4], 2024 | Gertzbein-Robbins | Not directly reported; lower revision rate (0%) in RA suggests higher accuracy | Not reported; higher revision rate (2.4%) in FA group | P = 0.03 (lower revision rate in RA group) | Not reported | Not reported | 33.3 ± 8.57 minutes/screw vs 30.7 ± 6.87 minutes/screw (P = 0.125) | 161.4 ± 365.7 vs 155.1 ± 194.7 (NS) | 4.9 ± 7.6 vs 20.3 ± 14.0 mGy/screw (P < 0.001) | RA had longer anesthesia time (49.1 minutes/screw vs 43.6 minutes/screw, P = 0.009) |
| Lin et al[5], 2022 | Gertzbein-Robbins | 99.7% | 98.2% | P = 0.04 | 99.7% vs 98.2% | P = 0.04 | 280.7 vs 251.4 minutes (NS) | 313.7 vs 431.6 mL (P = 0.019) | Not reported | RA reduced blood loss significantly. Shorter operative time for 4-level surgeries with RA. No difference in complications or pain outcomes |
| Li et al[6], 2025 | Gertzbein-Robbins | 96.8% | 92.9% | NS | 99.2% vs 98.2% | NS | 147 minutes vs 165 minutes (P = 0.04) | 124.4 vs 138.9 (NS) | 54.6 seconds vs 87.8 seconds (P < 0.01) | Shorter fluoroscopy and operative time in RA, no significant difference in blood loss |
| Heath et al[7], 2024 | Gertzbein-Robbins | 98.0% | 80.0% | P < 0.001 | 100% vs 92.1% | P = 0.003 | 263.5 minutes vs 243.4 minutes (P = 0.28) | 340.6 vs 256.6 (NS) | Not quantified (O-arm used in both groups) | Longer operative time for RA in 2-level fusions (324.7 vs 266.4 minutes, P = 0.03). No medial breaches or revisions in either group |
| Chang et al[11], 2022 | Gertzbein-Robbins | 99.1% | 93.7% | P < 0.05 | 100% vs 98.4% | P = 0.001 | 208 ± 15.2 minutes vs 161 ± 7.9 minutes (P = 0.02) | 25 ± 10 vs 100 ± 20 (P = 0.01) | Reduced (implied, not quantified) | RA-TLIF had shorter incisions (1.4 cm vs 2.5 cm, P = 0.01). Lower screw misplacement rate (0.9% vs 6.3%, P < 0.05). Steep learning curve noted |
| Lai et al[15], 2022 | Not reported | Not reported | Not reported | Not reported | Not reported | Not reported | 259.0 minutes vs 225.0 minutes (NS) | 400.0 vs 366.7 mL (NS) | Not reported | Lower screw loosening rate with RA (4.3% vs 10.2%, P = 0.049). Similar complication rates (24.1% both groups). RA screws placed closer to upper endplate (P < 0.001) |
| Zhang et al[20], 2019 | Gertzbein-Robbins | 93.2% | 85.8% | P = 0.02 | 98.3% vs 93.6% | P = 0.02 | 165.3 ± 58.9 minutes vs 154.7 ± 46.0 minutes (P = 0.349) | 187.2 ± 95.2 mL vs 373.2 ± 320.3 mL (P = 0.001) | Dose: 25.9 ± 14.2 μSv vs 70.5 ± 27.3 μSv (P < 0.001); time: 93.5 ± 37.9 seconds vs 70.5 ± 28.3 seconds (P = 0.002) | Fewer facet violations: 5 vs 24 screws (P = 0.001). Lower revisions: 0 (RA) vs 2 (FG). Learning curve noted for RA (longer setup time) |
| Zhang et al[21], 2019 | Gertzbein-Robbins | 85.0% | 71.0% | P = 0.017 | 98.0% vs 94.0% | NS | 184.7 minutes vs 117.8 minutes (P < 0.001) | 171.6 vs 362.0 mL (P = 0.001) | 30.3 vs 65.3 μSv (P < 0.001) | Longer operative time but less radiation in RA |
| Chen et al[22], 2021 | Gertzbein-Robbins | 92.3% | 77.4% | P < 0.001 | 98.6% vs 96.6% | NS | 169.67 minutes vs 135.48 minutes (P < 0.001) | 92 vs 261 mL (P < 0.001) | 1.26 minutes vs 0.54 minutes (P < 0.001) | Longer operative time but less blood loss in RA |
| Cui et al[23], 2021 | Gertzbein-Robbins | 94.6% | 85.0% | P = 0.025 | 100% vs 100% | NS | RA MIS-TLIF: 135.1 ± 11.2 minutes; open TLIF: 102.2 ± 7.1 minutes (P = 0.002) | RA MIS-TLIF: 173.6 ± 17.9 mL; open TLIF: 332.1 ± 23.5 mL (P = 0.005) | Not reported | RA MIS-TLIF showed longer operative time (learning curve) but significantly reduced blood loss and postoperative drainage (97.5 mL vs 261.3 mL, P < 0.001). Faster recovery: Shorter hospitalization (7.3 days vs 10.0 days) and time to ambulation (1.5 days vs 2.9 days, P < 0.05). Less muscle atrophy: Paraspinal muscle cross-sectional area decreased by 3.9% (vs 14.5% in open TLIF, P = 0.016) |
| Feng et al[24], 2020 | Gertzbein-Robbins | 98.2% | 93.1% | P = 0.039 | 100% vs 99.4% | NS | 196.25 minutes vs 230.63 minutes (P < 0.05) | 165 mL vs 237.5 mL (P < 0.05) | Not reported | Efficiency benefit with RA |
| Li et al[25], 2024 | Gertzbein-Robbins | 87.5% | 70.1% | P < 0.001 | 98.3% vs 96.9% | NS | 158.5 minutes vs 146.4 minutes (P < 0.001) | 58.5 vs 52.8 (NS) | Surgeon: 13.8 vs 74.7 fluoroscopy counts (P < 0.001) | Longer operative time in RA but reduced surgeon radiation exposure. Lower adjacent segment degeneration (0.63 mm vs 0.92 mm height loss, P = 0.001) |
| Han et al[26], 2021 | Gertzbein-Robbins | 92.9% | 90.9% | NS | 97.3% vs 96.2% | NS | OLIF: 164.9 ± 56.0 minutes; MIS-TLIF: 121.5 ± 48.2 minutes (P < 0.01) | OLIF: 142.4 ± 89.4 mL; MIS-TLIF: 291.5 ± 72.3 mL (P < 0.01) | Not reported | OLIF had significantly longer operative time but less blood loss. OLIF required position change (lateral to prone), contributing to longer time |
| De Biase et al[27], 2021 | Gertzbein-Robbins | 97.4% | 93.9% | NS | 99.8% vs 99.3% | NS | 241 minutes vs 246 minutes (NS) | 73.8 mL vs 73.9 mL (NS) | 31.5 vs 59.5 mGy (P = 0.035) | RA reduced radiation |
| Fayed et al[28], 2020 | Gertzbein-Robbins | 94.2% | 96.7% | NS | 98.1% vs 100% | NS | Not reported | Not reported | Not reported | RA-PPS: 5.8% breach rate (6/103 screws), with 1.9% significant breaches (> 2 mm). FG-PPS: 3.3% breach rate (3/90 screws), with 1.1% significant breaches. Learning curve: 5 breaches in first 48 screws (10 cases) vs 1 breach in next 55 screws (10 cases). Lateral breaches (4/6) linked to facet hypertrophy/skiving |
| Yang et al[29], 2019 | Gertzbein-Robbins | 93.8% | 73.8% | P = 0.012 | 98.5% vs 96.9% | NS | Not reported | Not reported | Not reported | The study focused on accuracy (pedicle screw placement) and facet joint violation, not perioperative efficiency metrics. RA group had significantly lower pedicle wall penetration (6.2% vs 26.2%) and facet joint violation (5.1% vs 15.6%) compared to FG group. No severe deviations (Neo grade III) in RA group |
| Schatlo et al[30], 2014 | Gertzbein-Robbins | 83.6% | 79.8% | NS | 91.4% vs 87.1% | NS | 205 minutes vs 189 minutes (NS) | 375 mL vs 713 mL (P < 0.01) | Not reported | Lower blood loss in RA |
| Li et al[31], 2024 | Gertzbein-Robbins | 96.3% | 95.0% | P < 0.05 | A: 99%, B: 98%, C: 91% | P < 0.05 | A: 146.9 ± 10.8, B: 172.5 ± 13.2, C: 169.0 ± 13.6 (A < B/C, P < 0.05) | A: 89.3 ± 11.3, B: 74.4 ± 14.6, C: 111.6 ± 20.9 (B < A < C, P < 0.05) | Significantly lower in A/B vs C (P < 0.05) | Group A had shortest operation time. Group B had least blood loss. Group C had highest radiation exposure |
| Li et al[32], 2022 | Gertzbein-Robbins | 99.24% | 91.03% | P = 0.002 | Not specifically reported in A + B form, but grade A alone was 99.24% in RA | Not explicitly reported for A + B, only for grade A | 154.75 ± 7.32 vs 172.22 ± 14.82 (P = 0.001) | 89.49 ± 18.63 vs 121.48 ± 20.55 (P = 0.001) | Fluoroscopy: 59.54 ± 6.56 vs 70.67 ± 9.70 (P = 0.001) | Robot group had shorter hospital stays (3.86 ± 1.17 days vs 5.03 ± 0.73 days) |
Table 3 Radiographic outcomes
| Ref. | Procedure | Sagittal alignment changes | Facet joint violation (RA vs comparator) | Notes |
| --- | --- | --- | --- | --- |
| Wang et al[1], 2023 | MIS-TLIF (RA vs FA) | Not reported | FJV grades: RA: 89.8% grade 0 (no violation); FA: 62.1% grade 0 (P < 0.001); mean FJV grade lower in RA (0.24 vs 0.50; P < 0.001) | Adjacent segment: Less disc height loss at proximal adjacent segment in RA (0.69 mm vs 0.93 mm; P < 0.001). Fusion rates: No difference (BSF-3: 88.5% RA vs 85.5% FA; P = 0.616) |
| Tong et al[2], 2024 | RA-OLIF vs FG OLIF | Sagittal alignment parameters (LL, SL, PI-LL) not explicitly reported; focus on screw accuracy and FJV rates | RA-OLIF: 4.7% FJV rate (61 grade 0, 2 grade 1, 1 grade 2); fluoroscopy: 19.3% FJV rate (71 grade 0, 11 grade 1, 5 grade 2, 1 grade 3) (P = 0.009) | RA-OLIF had higher screw accuracy (98.4% grade A/B vs 95.5% in fluoroscopy; P = 0.015). Shorter-term benefits: Lower VAS-back at 3 days post-op (P = 0.003) |
| Shafi et al[3], 2022 | MIS-TLIF (RN vs ION) | Not reported | FJV rates: RN: 5.0%; ION: 1.3% (P = 0.0017) | Screw dimensions: RN allowed larger screw diameters (7.25 mm vs 6.72 mm; P < 0.001) and longer screws (48.4 mm vs 45.6 mm; P < 0.001). Accuracy: Similar "ideal" (grade A) screw rates (88.5% RN vs 88.4% ION; P = 0.969), but RN eliminated high-grade breaches (0% grade E vs 1.2% in ION; P = 0.051). Endplate breaches: Higher in RN (6.9% vs 1.3%; P = 0.001), but most were clinically insignificant |
| Griepp et al[4], 2024 | MIS-TLIF (RA vs O-arm navigation) | Not reported | Lateral breaches: RA: 4/210 (1.9%, all grade B); ON: 24/304 (7.89%, grades C-D); medial breaches: 0 in both groups | No revisions for malposition in either group |
| Lin et al[5], 2022 | MIS-TLIF (robot-guided vs freehand) | Not reported | Not reported | Pedicle screw breach rates: Robot-guided (0.27%) vs freehand (1.75%), P = 0.04. Lateral breaches more common in freehand group (9/12 breaches). No medial breaches with robotics |
| Li et al[6], 2025 | RA-SP-LLIF vs traditional LLIF | Significant postoperative improvements in LL (45.2° to 51.5°), SL (24.0° to 29.3°), and PI-LL (13.0° to 7.8°) (P < 0.01). Gains in LL/SL/PI-LL were not sustained at final follow-up (P > 0.05 vs baseline). No difference in PT/SS changes | Not explicitly reported, but the high screw accuracy (99.2% RA vs 98.2% traditional) suggests low risk | Comparable fusion rates (Bridwell grade) and complications between groups |
| Heath et al[7], 2024 | RA-MIS-TLIF vs ON-MIS-TLIF | Sagittal alignment parameters (LL, SL, PI-LL) not explicitly reported; focus on screw accuracy and breach rates | RA-MIS-TLIF: 0% breach rate (100% grades A/B); ON-MIS-TLIF: 7.89% breach rate (92.1% grades A/B; P < 0.001); no medial breaches in either group | No reoperations for screw malposition in either group |
| Chang et al[11], 2022 | PE RA-TLIF vs MIS-TLIF | Not reported | Not reported | Screw misplacement rate: Robot 0.9% vs fluoroscopy 6.3% (P < 0.05). Fusion rates: 87.3% (robot) vs 91.8% (fluoroscopy, P = 0.53). Smaller incisions, less blood loss with robotics |
| Lai et al[15], 2022 | MIS-TLIF (robot vs fluoroscopy) | Not reported | Not reported | Screw loosening: Robot 4.3% vs fluoroscopy 10.2% (P = 0.049). Robot screws placed closer to upper endplate (ratio 0.35 vs 0.39, P < 0.001). Loosening linked to age, multilevel fusion, and endplate distance ratio |
| Zhang et al[20], 2019 | TLIF with pedicle screws | Not reported for LL/SL/PI-LL | RA: 5/176 screws (2.8%) violated facets; FG: 24/204 screws (11.8%) (P = 0.001) | RA achieved higher perfect screw placement (grade A: 93.2% vs 85.8%, P = 0.020). No severe breaches (grade E) in RA vs 2 in FG |
| Zhang et al[21], 2019 | TLIF with percutaneous pedicle screws | Not reported for LL/SL/PI-LL | RA: 4/100 screws (4%) violated facets (grades 1-2); FG: 26/100 screws (26%) (grades 1-3) (P < 0.001) | RA eliminated severe FJV (grade 3: 0% vs 3% in FG). Larger screw-to-facet distance (4.16 mm vs 1.92 mm, P < 0.001) |
| Chen et al[22], 2021 | RA MIS-TLIF vs open TLIF | Not reported | Not reported | RA advantages: Higher screw accuracy (92.3% grade A vs 77.4%), faster early pain relief (VAS/ODI at 1 month). Both groups: Similar 1-year fusion rates (94.2% vs 92.3%) |
| Cui et al[23], 2021 | RA-MIS-TLIF vs open TLIF | Alignment restored in both groups (no quantitative LL/SL data) | RA: 0% (no revisions) vs open: 5% (5 screws revised) | RA: Reduced paraspinal muscle atrophy (P = 0.016) at 2-year follow-up |
| Feng et al[24], 2020 | RA-OLIF vs freehand OLIF | Alignment restored via indirect decompression (no quantitative LL/SL data) | RA: 1.8% (3/170 screws breached) vs freehand: 6.9% (12/174 screws breached) | RA: Reduced blood loss (P = 0.022) and eliminated postoperative drainage |
| Li et al[25], 2024 | RA-MIS-TLIF vs fluoroscopy-MIS-TLIF | Sagittal alignment parameters (LL, SL, PI-LL) not explicitly reported; focus on screw accuracy, FJV, and adjacent segment degeneration | RA-MIS-TLIF: 0.13 ± 0.43 FJV grade (90.1% grade 0); fluoroscopy: 0.43 ± 0.68 FJV grade (66.1% grade 0) (P < 0.001) | Reduced adjacent segment disc height loss (0.63 ± 0.38 mm vs 0.92 ± 0.35 mm; P = 0.001). Comparable fusion rates (BSF grades; P = 0.522) |
| Han et al[26], 2021 | OLIF vs MIS-TLIF | Not reported | Not reported | OLIF advantages: Higher disc height (12.4 vs 11.2 mm) and fusion rate (96% vs 87%). Both groups: Similar screw accuracy (97.3% vs 96.2%) |
| De Biase et al[27], 2021 | RA vs FG MI-TLIF | Not reported | Not reported | RA advantages: 50% lower radiation dose (31.5 vs 59.5 mGy). Both groups: 0% screw breaches, similar revision rates (1 vs 2 cases) |
| Fayed et al[28], 2020 | RA-PPS (ExcelsiusGPS) vs FG-PPS | Not explicitly measured; alignment inferred from screw accuracy | RA: 5.8% breaches (4 lateral, 2 grade E) vs FG: 3.3% breaches (1 medial) | RA breaches linked to facet hypertrophy; no revisions needed. Short learning curve (1.9% significant breaches after initial cases) |
| Yang et al[29], 2019 | MIS-TLIF with percutaneous pedicle screws | Screw insertion angle: RA: 23.8° ± 6.1° vs fluoroscopy: 18.4° ± 7.2° (P = 0.017) | RA: 5.1% (grades I-II); fluoroscopy: 15.6% (grades I-III), including 2.1% severe (grade III) | RA reduced severe deviations (Neo grade III: 0% vs 3.1%) and improved pedicle screw accuracy (93.8% grade 0 vs 73.8%) |
| Schatlo et al[30], 2014 | Lumbar fusion (open/percutaneous) | Not reported for LL/SL/PI-LL | RA: 8.6% poor trajectory (grades C-E/R); FG: 12.9% (grades C-E) (P = 0.09) | Lateral misplacement most frequent (RA: 47% of deviations; FG: 39%). No difference in clinically acceptable screws (A/B: 91.4% RA vs 87.1% FG, P = 0.19) |
| Li et al[31], 2024 | RA MIS-TLIF/UBE-T/PLIF | Not reported | Not reported | Screw accuracy significantly better in groups A/B vs C (P < 0.05) |
| Li et al[32], 2022 | RA MIS-TLIF | Not reported | Not reported | Higher screw accuracy in robot group (P = 0.002) |
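The facet joint violation (FJV) counts in Table 3 can be converted into per-study rates and relative reductions with simple arithmetic. The Python sketch below does this for the three studies that give raw screw counts (Tong et al[2], Zhang et al[20], Zhang et al[21]). The class and function names are illustrative assumptions, and the output is per-study arithmetic only, not a pooled estimate from the review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FjvCounts:
    """FJV counts for one study (hypothetical container, not from the review)."""
    study: str
    ra_violations: int
    ra_screws: int
    cmp_violations: int
    cmp_screws: int

    @property
    def ra_rate(self) -> float:
        return self.ra_violations / self.ra_screws

    @property
    def cmp_rate(self) -> float:
        return self.cmp_violations / self.cmp_screws

    @property
    def relative_reduction(self) -> float:
        """Fraction by which the RA rate is lower than the comparator rate."""
        return 1.0 - self.ra_rate / self.cmp_rate


# Counts transcribed from Table 3. For Tong et al, violations are the screws
# graded 1 or higher and the denominator is the sum over all grades.
STUDIES = [
    FjvCounts("Tong et al[2], 2024", 2 + 1, 61 + 2 + 1, 11 + 5 + 1, 71 + 11 + 5 + 1),
    FjvCounts("Zhang et al[20], 2019", 5, 176, 24, 204),
    FjvCounts("Zhang et al[21], 2019", 4, 100, 26, 100),
]


def summarize(studies):
    """Return (study, RA %, comparator %, relative reduction %), rounded to 0.1."""
    return [
        (s.study,
         round(100 * s.ra_rate, 1),
         round(100 * s.cmp_rate, 1),
         round(100 * s.relative_reduction, 1))
        for s in studies
    ]


if __name__ == "__main__":
    for row in summarize(STUDIES):
        print("%s: RA %.1f%% vs comparator %.1f%% (relative reduction %.1f%%)" % row)
```

The recomputed rates match the percentages printed in Table 3 for these studies, which is a quick way to check a transcribed row against its own counts.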
Table 4 Clinical and patient-reported outcomes
| Ref. | VAS/ODI improvement | Revision rate (RA vs comparator) | Fusion success | Notes |
| --- | --- | --- | --- | --- |
| Wang et al[1], 2023 | VAS back: Pre-op 6.92 → 0.90 (RA), 6.78 → 0.71 (FA) at 2 years (P > 0.05). VAS leg: Pre-op 7.70 → 0.54 (RA), 7.56 → 0.44 (FA) (P > 0.05). ODI: Pre-op 70.90 → 15.23 (RA), 71.00 → 14.89 (FA) (P > 0.05) | RA: 1 lateral wall violation (adjusted intraoperatively). FA: 1 anterior vertebral perforation (abdominal pain), 1 nerve root irritation (required revision) | RA: 88.5% (BSF-3); FA: 85.5% (BSF-3) (P > 0.05) | RA showed fewer facet violations (P < 0.001). Less disc height loss at adjacent segments in RA (P < 0.001) |
| Tong et al[2], 2024 | VAS-back: Significantly lower in robot group at 3 days post-op (2.19 vs 3.18, P < 0.05); no difference at 3/6 months. VAS-leg: No significant difference at any time point. ODI: No significant difference at any time point | 1 case vs 1 case | Not reported | No complications like infection or dural tear reported in either group |
| Shafi et al[3], 2022 | Not reported | RA: No high-grade breaches (grade E). ION: 1.2% high-grade breaches (17 screws, P = 0.05) | Not reported | Higher facet violations in RN (5.0% vs 1.3%, P < 0.001), but no clinically significant breaches |
| Griepp et al[4], 2024 | ODI: Significant improvement in both groups at 6 months (Δ18.6 robot vs Δ18.2 fluoroscopy) and 12 months (Δ20.7 vs Δ22.4), with similar MCID achievement rates (P > 0.05). NRS back pain: Significant improvement in both groups at 6 months (Δ2.8 vs Δ2.3) and 12 months (Δ2.6 vs Δ2.8), with no intergroup differences (P > 0.05) | RA: 1 revision (infection-related hardware removal). Fluoroscopy group: 3 revisions (2 infections, 1 foraminotomy). Screw malposition: 0 revisions in both groups | High | Low rates in both groups (4.9% overall), with no neurological injuries |
| Lin et al[5], 2022 | Similar (P > 0.05) | RA: 1.3% intraop (K-wire malposition), 4.0% postop (CSF leak, wound infection). FG: 1.3% intraop (durotomy), 4.0% postop (screw malposition, wound infection) (P = 0.99 for postop surgery-related complications) | Not reported | RA reduced pedicle breaches (0.27% vs 1.75%, P = 0.04) and blood loss (P = 0.019) |
| Li et al[6], 2025 | VAS-back: 6.3 → 1.8 (RA) vs 6.1 → 1.7 (traditional). ODI: Comparable at 2 years | 0% (RA) vs 0.9% (traditional) | 90.3% vs 85.7% grade I | RA: 4 paresthesias; traditional: 2 paresthesias |
| Heath et al[7], 2024 | VAS/ODI: Not explicitly reported in the study. Clinical safety was confirmed by maintained neurological status postoperatively | RA: 0 revisions for screw malposition. Navigation group: 0 revisions for screw malposition | High (no difference) | No medial breaches or neurological complications in either group |
| Chang et al[11], 2022 | VAS for back pain: Better in PE RA-TLIF (1.3 ± 0.4) vs MIS-TLIF (2.1 ± 0.1), P < 0.05. ODI: No significant difference (17 ± 5 vs 21 ± 8, P = 0.09) | No revisions reported. Misplacement rate: 0.9% (PE RA-TLIF) vs 6.3% (MIS-TLIF), P < 0.05 | Fusion rate: 87.3% (PE RA-TLIF) vs 91.8% (MIS-TLIF), P = 0.53 | PE RA-TLIF showed reduced surgical trauma and faster recovery |
| Lai et al[15], 2022 | VAS-leg/back and ODI (pre-op to 12-month): VAS-leg: Pre-op 8.0 → 0.0 (Ro) vs 0.0 (FG). VAS-back: Pre-op 8.0 → 2.0 (Ro) vs 3.0 (FG). ODI: Pre-op 57.78 → 26.67 (Ro) vs 28.89 (FG) (all P < 0.05 for improvement; no intergroup differences) | Complications: RA TLIF: 24.1% (7/29): 3 screw loosening, 3 cage subsidence, 2 infections. FG TLIF: 24.1% (19/79): 14 screw loosening, 2 cage subsidence, 3 infections. Revisions: 1 broken rod (FG TLIF) | Not reported | Less screw loosening in RA (P = 0.049) |
| Zhang et al[20], 2019 | Not reported | RA: 0% (0/176 screws); comparator: 1.0% (2/204 screws) | Not reported | FJV: RA-PPS: 5 screws vs FG-PPS: 24 screws (P = 0.001). Blood loss: Reduced in RA group (187.2 mL vs 373.2 mL, P = 0.001) |
| Zhang et al[21], 2019 | Not reported | RA: 0% (0/100 screws); FG: 1% (1/100 screws) | Not reported | FJV: RA: 4% (4/100) vs FG: 26% (26/100) (P = 0.0001). Severe FJV (grade 3): Only in FG group (3 screws). Intra-pedicle accuracy (grade A): RA: 85% vs FG: 71% (P = 0.017). Blood loss: RA: 171.6 mL vs FG: 362.0 mL (P = 0.001) |
| Chen et al[22], 2021 | Similar (P > 0.05) | None | 94.2% vs 92.3% (NS) | RA showed shorter hospital stay |
| Cui et al[23], 2021 | VAS: 6.9 → 2.1 (RA) vs 6.5 → 3.7 (open) (P = 0.004); ODI: Comparable at 2 years | 0% (RA) vs 5% (open) | Comparable | RA: 1 transient numbness; open: 1 screw loosening |
| Feng et al[24], 2020 | VAS back pain: Immediate post-op: 2.15 (RA) vs 3.35 (comparator) (P < 0.05). ODI: No significant difference between groups at any time point | RA: 0.6% (1/170 screws); comparator: 3.4% (6/174 screws) | Not reported | RA group had shorter operative time, less blood loss, and no postoperative drainage. Complications: RA (1 hip flexor weakness); comparator (1 infection, 1 hip flexor weakness, 1 delayed wound healing) |
| Li et al[25], 2024 | VAS-back/leg: No significant difference between groups pre-op, post-op 3 days, or final follow-up (P > 0.05). ODI: No significant difference at any time point (P > 0.05) | RA: 1 screw revision (penetrated outer pedicle wall, adjusted intraoperatively). Freehand group: 2 screws revised (penetrated anterior cortex, causing transient abdominal pain; 1 screw irritated nerve root, requiring immediate revision) | No significant difference in fusion status (BSF grading) between groups (P > 0.05) | Lower FJV in robot group (0.13 grades vs 0.43 grades, P < 0.001). Less adjacent segment disc height loss in robot group (0.63 mm vs 0.92 mm, P = 0.001) |
| Han et al[26], 2021 | VAS back pain: Lower in OLIF at 1 week (2.8 vs 3.5, P < 0.05) and 3 months (1.6 vs 2.1, P < 0.05). ODI: Lower in OLIF at 3 months (22.3 vs 26.1, P < 0.05). No differences in leg pain VAS | No revisions reported. Complications: OLIF (7/28) vs MIS-TLIF (5/33), all resolved conservatively | Fusion rate: Higher in OLIF (96% vs 87%, P < 0.01). Disc height: Greater in OLIF (12.4 mm vs 11.2 mm, P < 0.01) | Higher fusion rate in OLIF with RA. OLIF had less blood loss (142.4 vs 291.5 mL, P < 0.01) and shorter hospital stays (3.2 vs 4.2 days, P < 0.01) |
| De Biase et al[27], 2021 | Not reported | Revisions: 1/52 (RA, pseudoarthrosis) vs 2/49 (FG, no surgery required) | Fusion status: No significant differences (pseudoarthrosis rates: 2% RA vs 4% FG, P = 0.523) | Lower radiation in RA group. Operative time, blood loss, hospital stay, and complication rates were similar |
| Fayed et al[28], 2020 | Not reported | RA: 0% (0/103 screws); comparator: 1.1% (1/90 screws) | Not reported | Complications: No revisions required in RA-PPS group; 1 revision in FG-PPS group (medial breach) |
| Yang et al[29], 2019 | Not reported | RA: 0% (0/130 screws); comparator: 3.1% (4/130 screws) | Not reported | FJV: RA-PPS: 5.1% (5/98 screws); FG-PPS: 15.6% (15/96 screws) (P = 0.021). Complications: No severe facet violations (Babu grade III) in RA-PPS group; 2 cases in FG-PPS group |
| Schatlo et al[30], 2014 | Not reported | Robot-assisted: 2.5% (6/244 screws revised intraoperatively); FG: 1 revision surgery (for radiculopathy due to screw malposition) | Not reported | Neurological injury occurred in 1 FG case (resolved after revision). No significant difference in infection rates (robot: 1.8%, fluoroscopy: 2.5%). Blood loss was significantly lower in the robot-assisted group |
| Li et al[31], 2024 | Significant improvement at 6 months (P < 0.05), no difference between groups (P > 0.05) | A: 0, B: 0, C: 0 | Not reported | Macnab excellent/good rates: A: 96%, B: 93%, C: 92% (P > 0.05). No difference in complications (P > 0.05) |
| Li et al[32], 2022 | No significant difference between groups (P > 0.05) | 0/33 vs 2/39 (screw reinsertion) | Not reported | Macnab excellent rate: 91% (RA) vs 87% (conventional group), P = 0.900. Complications: 9% vs 20% |
Table 5 Risk of bias assessment for randomized controlled trial (Cochrane Risk of Bias Tool 2.0)
| Ref. | Randomization process | Deviations from intended interventions | Missing outcome data | Measurement of the outcome | Selection of reported result | Overall risk of bias |
| --- | --- | --- | --- | --- | --- | --- |
| Feng et al[24], 2020 | Low | Low | Low | Low | Low | Low |
Table 6 Risk of bias assessment for observational studies (risk of bias in non-randomized studies of interventions)
| Ref. | Confounding | Selection bias | Classification of interventions | Deviations from intended interventions | Missing data | Measurement of outcomes | Selection of reported result | Overall risk of bias |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Wang et al[1], 2023 | Low | Low | Low | Low | Low | Low | Low | Low |
| Tong et al[2], 2024 | Moderate | Moderate | Low | Low | Low | Low | Low | Moderate |
| Shafi et al[3], 2022 | Moderate | Moderate | Low | Low | Low | Low | Low | Moderate |
| Griepp et al[4], 2024 | Moderate | High | Low | Low | Low | Low | Low | Moderate |
| Lin et al[5], 2022 | Moderate | Moderate | Low | Low | Low | Low | Moderate | Moderate |
| Li et al[6], 2025 | Moderate | Low | Low | Low | Low | Low | Low | Moderate |
| Heath et al[7], 2024 | Moderate | Moderate | Low | Low | Low | Low | Low | Moderate |
| Chang et al[11], 2022 | Moderate | Low | Low | Low | Low | Low | Low | Moderate |
| Lai et al[15], 2022 | High | Moderate | Low | Low | Low | Low | Moderate | Moderate |
| Zhang et al[20], 2019 | High | High | Low | Low | Low | Moderate | Moderate | High |
| Zhang et al[21], 2019 | Moderate | Moderate | Low | Low | Low | Low | Low | Moderate |
| Chen et al[22], 2021 | Moderate | Moderate | Low | Low | Low | Low | Moderate | Moderate |
| Cui et al[23], 2021 | Moderate | Low | Low | Low | Low | Moderate | Low | Moderate |
| Li et al[25], 2024 | Moderate | Moderate | Low | Low | Low | Low | Low | Moderate |
| Han et al[26], 2021 | High | Moderate | Low | Low | Low | Moderate | Moderate | High |
| De Biase et al[27], 2021 | High | Moderate | Low | Low | Moderate | Moderate | Moderate | High |
| Fayed et al[28], 2020 | Moderate | Moderate | Low | Low | Low | Low | Low | Moderate |
| Yang et al[29], 2019 | Moderate | Moderate | Low | Low | Low | Low | Low | Moderate |
| Schatlo et al[30], 2014 | High | Moderate | Low | Low | Low | Low | Low | Moderate |
| Li et al[31], 2024 | Moderate | Low | Low | Low | Low | Low | Low | Moderate |
| Li et al[32], 2022 | Moderate | Low | Low | Low | Low | Low | Low | Moderate |
Table 7 Summary of findings and Grading of Recommendations Assessment, Development and Evaluation evidence quality assessment
| Outcome | Number of studies | Study design(s) | Risk of bias | Inconsistency | Indirectness | Imprecision | Publication bias | Overall quality (GRADE) | Comments/explanation |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Screw accuracy (grade A) | 1 RCT, 21 observational | RCT, observational | Moderate | Not serious | Not serious | Not serious | Undetected | Moderate | Consistent large effect across multiple studies despite observational nature |
| Operative time | 1 RCT, 17 observational | RCT, observational | Moderate | Serious | Not serious | Serious | Undetected | Low | Risk of confounding and moderate heterogeneity across studies (I² = 66%) |
| Blood loss | 1 RCT, 17 observational | RCT, observational | Low | Not serious | Not serious | Not serious | Undetected | Low | Multiple studies had critical ROBINS-I domains; small-to-moderate effect size |
| FJV | 1 RCT, 12 observational | RCT, observational | Moderate | Not serious | Not serious | Not serious | Undetected | Moderate | RA reduced FJV rates by 50%-75% |
| Sagittal alignment | 0 RCT, 7 observational | Observational | Moderate | Serious | Not serious | Serious | Undetected | Low | Limited data; heterogeneous measurements |
| Patient-reported outcomes | 1 RCT, 14 observational | RCT, observational | Moderate | Not serious | Not serious | Not serious | Undetected | Moderate | No long-term differences between RA and comparators |
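Table 7 cites moderate heterogeneity for operative time (I² = 66%). As a reminder of what that statistic measures, the Python sketch below computes Cochran's Q with inverse-variance weights and then I² = max(0, (Q − df)/Q) × 100. The function names and the toy inputs are assumptions made for illustration; they are not data from the included studies, and the pooling model the review actually used is not stated in this section.

```python
def cochran_q(effects, variances):
    """Cochran's Q using fixed-effect inverse-variance weights."""
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need matching lists with at least two studies")
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    return sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))


def i_squared(q, df):
    """I^2 as a percentage: max(0, (Q - df) / Q) * 100."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


if __name__ == "__main__":
    # Toy inputs (hypothetical): two studies with equal variance.
    effects = [1.0, 3.0]
    variances = [1.0, 1.0]
    q = cochran_q(effects, variances)
    print("Q = %.2f, I^2 = %.1f%%" % (q, i_squared(q, df=len(effects) - 1)))
```

For scale: if all 18 operative-time studies were pooled (df = 17), a Q of about 50 would correspond to I² = 66%. That Q is back-calculated for illustration only and is not a value reported in the review.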