The literature was reviewed to evaluate the compliance of randomized clinical trials (RCTs) with the CONsolidated Standards of Reporting Trials (CONSORT) and the risk of bias of these studies through the Cochrane Collaboration risk of bias tool (CCRT). The Cochrane Library, PubMed, and other electronic databases were searched for RCTs of adhesive systems in cervical lesions. The compliance of the articles with CONSORT was evaluated using the following scale: 0 = no description, 1 = poor description, and 2 = adequate description. Descriptive analyses of the number of studies by journal, follow-up period, and country were performed, and quality was assessed with the CCRT for risk of bias in RCTs. One hundred thirty-eight RCTs were left for assessment. More than 30% of the studies received scores of 0 or 1. Flow chart, effect size, allocation concealment, and sample size were the most critical items, with more than 80% of the studies receiving a score of 0. The overall CONSORT score for the included studies was 15.0 ± 4.8 points, which represents 46.9% of the maximum CONSORT score. Significant differences were observed among countries (p<0.001) and among publication periods (p<0.001). Only 4.3% of the studies were judged to be at low risk of bias; 36.2% were classified as having unclear risk and 59.4% as having high risk of bias. Adherence of RCTs evaluating adhesive systems to the CONSORT statement is low, and most studies are at unclear or high risk of bias.
Adhesive systems allow good retention of restorative materials, so macro-mechanical retention is no longer essential. This might explain the rapid evolution and release of several commercial adhesive formulations. Etch-and-rinse adhesives, which require preliminary removal of the smear layer, are offered in two- and three-step versions. Self-etch adhesives, capable of simultaneously demineralizing and infiltrating the dental substrates, are sold in one or two clinical steps. More recent and versatile systems, known as universal adhesives, can be used in either etch-and-rinse or self-etch mode.
Despite the benefits that adhesive systems have made possible, clinicians are exposed to adhesives that use different bonding strategies with different levels of simplification. To make things more complicated, for each one of these combinations, a high number of commercial brands are available.
Laboratory testing is a very useful method for comparing the bonding performance of adhesive systems, but thus far, authors of few studies have found any correlation of their results with clinically important outcomes. On the other hand, clinical trials can provide reliable and direct evidence to guide clinicians to choose dental materials. The comparison of bonding techniques and adhesive systems is usually performed with noncarious cervical lesions (NCCLs), as these lesions lack macro-mechanical retention and therefore restoration loss is due to ineffective bonding, which is an objective and clinically important outcome for adhesive efficacy.
Randomized controlled trials (RCTs) represent the standard design for evaluation of health care interventions. Well-designed RCTs and systematic reviews of well-designed RCTs are at the top of the hierarchy of the levels of evidence. However, RCTs can yield biased results if they lack methodologic rigor.1 Problems with the design and execution of RCTs raise questions about the validity and reliability of their findings and can lead to underestimation or overestimation of the true intervention effect.2-4
Therefore, the quality of RCTs should be appraised before any clinical decision making. This assessment depends on good reporting in the methods and results sections of the RCTs. In an attempt to standardize reporting, a group of experts met in 1996 and produced the CONSORT statement,5 a checklist of recommendations for reporting clinical trials in the biomedical literature. The CONSORT statement was revised in 2001,6 and the most recent version was published in 2010.7,8
The compliance of RCTs with the CONSORT statement7,8 was evaluated in several specialties of medicine,9,10 as well as in some areas of dentistry, such as implantology, prosthodontics,11,12 periodontology,13 orthodontics,14-16 and pediatric dentistry.17 Given the importance of RCTs in NCCLs for decision making during restorative procedures, the aim of this study was to systematically review the literature in peer-reviewed journals to evaluate 1) the compliance of recent RCTs with the CONSORT statement and 2) the risk of bias of these studies through the Cochrane Collaboration risk of bias tool (CCRT).
METHODS AND MATERIALS
This study was not registered a priori as no known register currently accepts protocols for methodology of systematic reviews.
The following electronic databases were used to identify eligible studies: Cochrane Library, MEDLINE via PubMed, EMBASE, Latin American and Caribbean Health Sciences Literature (LILACS) database, and the Brazilian Library in Dentistry (BBO). Citation databases such as Scopus and Web of Science (Table 1) were also searched. Additionally, the reference lists of all primary studies were searched for additional relevant publications, as well as the first page of the related articles' links to each primary study in the PubMed database. Articles in Japanese, Chinese, Arabic, and other Eastern languages were not included due to difficulties in the translation process.
The search strategy was first prepared for the MEDLINE database by using controlled vocabulary (MeSH terms) and free keywords. Then, the search strategy was adapted to the other electronic and citation databases (Table 1). Only studies published in 1996 or later were included. This time period was chosen because the CONSORT statement was first published in 1996, and hence it would be unfair to expect that RCTs prior to this year would adhere to a standard that did not exist at the time of writing. Gray literature was not addressed because the study objective was to evaluate studies published in peer-reviewed journals.
Parallel and split-mouth RCTs that evaluated the performance of adhesive systems, restorative materials, or restorative and technique protocols in NCCLs of adult patients of any age group were included. RCTs should have at least two comparable groups, in which one of the groups was testing an adhesive system.
Articles were excluded 1) if the study was a laboratory evaluation rather than a clinical evaluation; 2) if the study evaluated techniques for management of dentin hypersensitivity; 3) if they were conference abstracts, theses, or reports published in any medium other than peer-reviewed journals; and 4) if they were published earlier than 1996.
Initially, the articles were selected by title and abstract, and duplicates were removed. Full-text articles were then obtained, and three reviewers (J.G., L.W., and A.R.) identified those that met the inclusion criteria.
Adherence to CONSORT Statement
An evaluation tool based on the items related to the methods and results from the 2010 CONSORT statement was developed7,8 to evaluate the reporting completeness of RCTs (Table 2). A total of 12 items of the CONSORT were included in this CONSORT evaluation tool. As some of these items were subdivided, a total of 16 items were evaluated. The given score per item ranged from 0 to 2. In other words, 0 = no description, 1 = poor description, and 2 = adequate description. More details about the scoring process are found in Table 2. Each item was given equal weighting.
Before evaluation, the instrument was discussed between two experienced authors in clinical trials (A.D.L. and A.R.), pilot tested in 20 articles, and checked for accuracy and reproducibility by two evaluators. This process yielded modification of the instrument tool, as new possibilities for each score were observed and discussed during pilot testing.
A single author (A.R.) performed the scoring using the CONSORT evaluation tool (Table 2); only in case of doubt was a second author (A.D.L.) consulted for discussion and a final decision. Evaluators were not blinded to the study authors. Blinding was not feasible because the evaluators were familiar with the studies and could guess the research center by reading the paper.
Scoring System and Statistical Analysis
Descriptive analyses of the number of studies by journal, follow-up period, and country were performed. Compliance with individual items of the CONSORT statement was analyzed to determine what clinical researchers should improve in their reporting. To do this, the percentage of studies per score in each item was provided in a chart.
To achieve an overall compliance score per article, the scores of the 16 items were summed. A trial with complete adequate descriptions (score 2) in all CONSORT items would receive a maximum score of 32. An average score was calculated by period of time, journal, and country. Comparisons within each factor were performed with the Kruskal-Wallis and Mann-Whitney tests at a significance level of 5%. A linear correlation analysis between the 2015 International Scientific Index (ISI) journal impact factor and the average CONSORT score was also performed.
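The summation described above can be sketched in a few lines of Python. This is only an illustration: the function name and the example scores are hypothetical, and only the 16-item tool, the 0-to-2 scale, and the maximum of 32 points come from the text.

```python
MAX_ITEM_SCORE = 2   # 0 = no description, 1 = poor, 2 = adequate
N_ITEMS = 16         # number of items in the CONSORT evaluation tool


def overall_consort_score(item_scores):
    """Return (total score, percent of the maximum) for one trial."""
    if len(item_scores) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} item scores")
    if any(score not in (0, 1, 2) for score in item_scores):
        raise ValueError("each item score must be 0, 1, or 2")
    total = sum(item_scores)
    percent = 100 * total / (N_ITEMS * MAX_ITEM_SCORE)
    return total, percent


# Hypothetical trial: 6 items adequate, 3 poorly described, 7 not described.
example_scores = [2] * 6 + [1] * 3 + [0] * 7
total, percent = overall_consort_score(example_scores)
print(total, round(percent, 1))  # prints: 15 46.9
```

Each item carries equal weight, so the percentage is simply the total divided by 32.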
Risk of Bias in Individual Studies
Quality assessments were performed by two independent reviewers, using the Cochrane Collaboration's tool for assessing risk of bias in RCTs.18 The assessment criteria contained six domains: sequence generation, allocation concealment, blinding of the outcome assessors, incomplete outcome data, selective outcome reporting, and other possible sources of bias.
For each aspect of the quality assessment, the risk of bias was scored following recommendations of the Cochrane Handbook for Systematic Reviews of Interventions 5.1.0 (http://handbook.cochrane.org). At the study level, the study was considered at low risk of bias only if all domains were judged at low risk. If at least one domain was judged as at unclear risk and none at high risk, the study was considered as having unclear risk of bias. If at least one domain was judged at high risk of bias, then the study was considered at high risk of bias. During data selection and quality assessment, disagreements between reviewers were solved through discussion.
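The study-level rule can be written as a small decision function. The sketch below assumes that a high-risk judgment takes precedence over an unclear one, which is consistent with the study counts reported in the results; the function name and the string labels are hypothetical.

```python
VALID_JUDGMENTS = ("low", "unclear", "high")


def study_level_risk(domain_judgments):
    """Combine per-domain judgments into a single study-level judgment."""
    judgments = [j.lower() for j in domain_judgments]
    if not judgments:
        raise ValueError("at least one domain judgment is required")
    if any(j not in VALID_JUDGMENTS for j in judgments):
        raise ValueError(f"judgments must be one of {VALID_JUDGMENTS}")
    if "high" in judgments:      # assumption: high overrides unclear
        return "high"
    if "unclear" in judgments:
        return "unclear"
    return "low"                 # every domain was judged at low risk


print(study_level_risk(["low", "unclear", "low", "high", "low", "low"]))
```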
Characteristics of the Included Studies
From a total of 2191 screened articles, 2031 were excluded for not meeting the inclusion criteria. The full texts of 160 papers were obtained and assessed, and 22 papers were excluded for the following reasons: 1) 10 studies were not RCTs; 2) four studies compared only glass ionomer cements; 3) two studies were duplicates; 4) one study performed replica rather than clinical evaluation; 5) one study was an abstract; 6) one study was in the Chinese language; 7) one study was performed in vitro; 8) one study was performed in class I and II restorations; and 9) one study evaluated only desensitizers. After these exclusions, 138 RCTs were left for final assessment (Figure 1).
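As a quick consistency check, the selection counts reported above can be tallied as follows. The variable names and dictionary keys are hypothetical; the numbers are those stated in the text.

```python
screened = 2191
excluded_at_screening = 2031

# Reasons for exclusion after full-text assessment, as listed in the text.
full_text_exclusions = {
    "not an RCT": 10,
    "compared only glass ionomer cements": 4,
    "duplicate": 2,
    "replica rather than clinical evaluation": 1,
    "abstract only": 1,
    "Chinese language": 1,
    "in vitro study": 1,
    "class I and II restorations": 1,
    "evaluated only desensitizers": 1,
}

full_text_assessed = screened - excluded_at_screening
excluded_full_text = sum(full_text_exclusions.values())
included = full_text_assessed - excluded_full_text
print(full_text_assessed, excluded_full_text, included)  # prints: 160 22 138
```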
The included RCTs investigated several issues. Study authors usually compared 1) patient-related (eg, dentin sclerosis) and operator-related factors (eg, clinical experience); 2) different adhesive systems for bonding and desensitization; 3) different restorative materials; 4) curing methods, and 5) composite-resin-based vs glass ionomer and/or resin-modified glass ionomer cements. In some studies, more than one of these variables were evaluated.
Table 3 displays the 138 RCTs tabulated by their collected characteristics. The journals contributing with the most RCTs were Operative Dentistry (17.4%), followed by American Journal of Dentistry (12.3%), Clinical Oral Investigations (10.1%), and Journal of Dentistry (10.1%). Approximately 26.9% of the publications were published in 16 different journals. The countries with most publications were Brazil (31.2%) and the United States (18.1%), representing together approximately 50% of all publications in the field. An increase in the number of articles is occurring over time, but unfortunately, more than half (62.3%) of the publications are of short-term duration (6 months to 2 years).
Study Compliance With Each of the CONSORT Instrument Tool Items
Figure 2 displays the percentage of studies receiving each score for each item of the CONSORT statement. For the items numbers analyzed, losses/exclusions, eligibility criteria, and intervention, approximately 70% of the studies were scored as 2, meaning adequate reporting of these items.
In all other items, more than 30% of the studies received a score of 1 (poor reporting) or a score of 0 (no report). This was most critical for the items protocol, flow chart, effect size, allocation concealment, and sample size, for which more than 80% of the studies were scored as 0 (no report).
Average CONSORT Score per Study Characteristics
The overall CONSORT score for the included studies in this review was 15.0 ± 4.8 points, which represents 46.9% of the maximum CONSORT score of 32 points. No influence of the journal on the average CONSORT score was observed (p=0.198; Table 4). No correlation between journal impact factor and overall CONSORT score was found (r=0.089; p=0.93; Figure 3). On the other hand, significant differences among countries were observed (p<0.001), with the average CONSORT score of Brazil being statistically higher than those of Egypt and Germany. Similarly, the range of years had a significant influence on the average CONSORT score: an increase in the average CONSORT score in recent years was observed (p<0.001; Table 4). In all other comparisons, no significant difference was detected. The individual CONSORT score for each of the included studies can be seen in Table 5.
Risk of Bias of the Included Studies
Except for the selective outcome reporting and incomplete outcome data, most of the studies were judged as unclear or at high risk of bias in the Cochrane Collaboration tool domains (Figure 4). For the new domain included by the review authors (experimental unit), the percentage of studies at high risk of bias was even higher than the other domains (Figure 4).
Table 5 reports the individual risk of bias in each domain for all included studies. This table allows the analysis of the risk of bias within studies. Only six included studies (4.3%) were judged to be at low risk of bias in all domains. Fifty studies had unclear risk of bias in at least one domain, resulting in 36.2% of the studies being classified as having unclear risk of bias. The remaining 82 studies were at high risk of bias in at least one domain, representing 59.4% of studies at high risk of bias.
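The percentages above follow directly from the study counts, as the short check below shows. The dictionary layout is only illustrative.

```python
counts = {"low": 6, "unclear": 50, "high": 82}
total_studies = sum(counts.values())

# Share of studies in each study-level risk-of-bias category, in percent.
percentages = {
    category: round(100 * n / total_studies, 1)
    for category, n in counts.items()
}
print(total_studies, percentages)
# prints: 138 {'low': 4.3, 'unclear': 36.2, 'high': 59.4}
```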
A very comprehensive search was performed, including different electronic databases and using controlled vocabulary and keywords for each of the concepts of the search. However, one cannot deny that some articles might have been missed during the search process. It is likely, however, that missed articles represent a small percentage of the included studies and, if there are any, they are unlikely to change the results presented herein.
Study Compliance With the CONSORT
The reporting quality of RCTs of adhesive systems placed in NCCLs was assessed using an instrument tool, which was elaborated based on the CONSORT statement.7,8 Unlike earlier studies on the same topic,11-13,15-17 the items related to the title and abstract, introduction, and discussion were not evaluated because these items are very subjective, and the study adherence to these items does not weaken either the quality of the study or their risk of bias.
The CONSORT statement reports only the items that should be addressed, but the instrument herein developed allows each item of the CONSORT statement to be scored as either 0 (no report), 1 (poor reporting), or 2 (adequate reporting), based on the detailed descriptions of what should be observed in each item. This allowed a better reproducibility of the scoring process and may aid researchers to better understand what and how data should be described in future RCTs of the bonding area.
The present study observed that most of the included articles did not strictly follow the CONSORT statement. On average, study compliance with the evaluated CONSORT items was only 46.9%. Increased compliance with the CONSORT statement was observed in the last 6 years (mean CONSORT score of 17.9 ± 5.0; 55.9% compliance), a finding already observed by other authors.14,15 However, this increase is still modest and the resulting compliance substandard, because it reached only slightly more than half of the maximum CONSORT score of 32 points.
Compliance with the CONSORT statement has already been studied in other fields of dentistry. In the orthodontic area, studies reported a compliance of 41.5%,15 51.7%,14 and 68.9%.16 In the fields of prosthodontics and implant dentistry, a compliance of approximately 68% was observed.11 Variations within the same area are likely related to the inclusion criteria of the studies, mainly regarding their period of publication. Additionally, variations in the approach used to evaluate the CONSORT compliance can yield discrepancies in the results. However, regardless of these variations, one may see that even in the best situation the compliance was still low, indicating need of improvement.
It has already been reported that journal endorsement of the CONSORT statement might beneficially influence the completeness of RCT reporting in medical journals10 and in orthodontic journals.15,19 Although some of the main journals that published studies of adhesives in NCCLs endorsed the CONSORT statement (ie, Journal of the American Dental Association, Journal of Dentistry, and American Journal of Dentistry), the journal and its impact factor did not influence the average CONSORT score, either in the present study or in a systematic review in medicine.9 Sarkis-Onofre and others20 recently confirmed that no correlation exists between journal endorsement of the CONSORT statement and improved completeness of RCT reporting in restorative dentistry. Perhaps editors and editorial boards from these journals do not check the submitted articles against the CONSORT statement, which prevents the journals from reaching the expected benefits. More attention to these items during the peer-review process is required.
As reported in the results section, the items sample size, allocation concealment, effect size, flow chart, and protocol were those with the poorest reporting. A priori sample size calculation prevents the publication of underpowered RCTs. In underpowered studies, negative findings do not necessarily mean the groups are not different from one another; they may be the result of a sample size too small to detect a “clinically important difference” among the groups.
A study should involve a sample size large enough to have a high probability (power) of detecting as statistically significant a clinically important difference of a given size, if such a difference exists. For such a purpose, and in superiority trials, authors should describe 1) the estimated outcomes in each group for the primary outcome(s) (ie, the clinically important difference between groups); 2) type I error; 3) power; and 4) for continuous outcomes, the standard deviation of the measurements.
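The four elements listed above map directly onto a sample size formula. As a minimal sketch for a binary primary outcome such as retention rate, the following assumes two independent groups (a parallel design), a two-sided superiority test, and the normal approximation without continuity correction; it does not account for split-mouth pairing or for multiple restorations per patient. The function name and the retention rates are hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Participants per group needed to detect p1 vs p2 (two-sided test)."""
    if not (0 < p1 < 1 and 0 < p2 < 1) or p1 == p2:
        raise ValueError("p1 and p2 must differ and lie strictly in (0, 1)")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # type I error
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - type II error
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2
    return math.ceil(n)


# Hypothetical example: expected retention of 95% vs 80%.
n = n_per_group_two_proportions(0.95, 0.80)
print(n)  # prints: 73
```

A smaller clinically important difference increases the required sample size sharply, because the difference enters the formula squared in the denominator.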
In the present study, approximately 82% of the RCTs did not report sample size calculation at all. This is also problematic in the medical field. For instance, Chan and Altman21 reported that 73% of the 519 medical trials indexed in PubMed in December 2000 did not report sample size calculation. To make the scenario even worse, authors usually do not report the primary outcome for which the sample size calculation was performed. In this review, only 30% of the included RCTs reported the primary outcomes of the study clearly. Although the United States Public Health Service evaluation22 and, more recently, the Fédération Dentaire Internationale criteria23 contain several criteria to be evaluated, in the case of RCTs about adhesive systems in NCCLs, retention rate should be regarded as the primary outcome and used for sample size calculation because it is a true end point.
The reporting of the randomization process should include details about the methods used to generate the random sequence. In this review, it was observed that this item was reported inadequately, or it was not reported at all in 63.8% of the cases. In the fields of prosthodontics and implant dentistry, this figure was 44.3%.11 Usually, authors refer to terms such as “random allocation” or “the groups were randomized,” without further elaboration. Authors should specify the method of sequence generation (such as a random number table or a computerized random number generator, coin toss, and dice throwing), as well as restrictions to the process such as stratification and block randomization.
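One of the restricted methods mentioned above, block randomization with a computerized random number generator, can be sketched as follows. The function name, group labels, block size, and seed are all hypothetical choices for illustration; a real trial would also document who generated the sequence and how it was concealed.

```python
import random


def block_randomization(n_participants, groups=("A", "B"), block_size=4, seed=None):
    """Generate an allocation sequence in balanced, randomly permuted blocks."""
    if block_size % len(groups) != 0:
        raise ValueError("block_size must be a multiple of the number of groups")
    rng = random.Random(seed)  # a fixed seed makes the sequence reproducible
    sequence = []
    while len(sequence) < n_participants:
        block = list(groups) * (block_size // len(groups))
        rng.shuffle(block)     # random order within each block
        sequence.extend(block)
    return sequence[:n_participants]


seq = block_randomization(12, seed=42)
print(seq)
```

Within every complete block, each group receives the same number of allocations, which keeps group sizes balanced throughout recruitment.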
Allocation concealment seeks to prevent foreknowledge of the sequence generation before implementation, and it is as important as sequence generation to prevent selection bias. Allocation concealment can always be successfully implemented. It should not be confused with blinding, as blinding prevents performance and detection bias.24 Despite the importance of allocation concealment, one can observe in 89.1% of the cases that there was no description of this item at all. This is also in agreement with previous literature findings. An inadequacy of allocation concealment description was observed in 78% of the RCTs among dental journals25 and 93% in the specialty of periodontology.13 Another problem related to inadequately and unclearly concealed RCTs is that effect sizes are exaggerated in favor of the experimental group.4
Blinding is also a key element in RCT reporting. In the present review, 70% of the RCTs performed poor or no reporting of blinding. During the execution of RCTs in NCCLs about adhesive systems or composite resins, operator blinding is quite impossible. However, patient and evaluator can still be blinded. If the primary outcome is retention rate, which is an objective parameter, lack of evaluator blindness does not put the study at high risk of bias, but for other subjective criteria such as marginal discoloration, marginal adaptation, color match, and others, the lack of evaluator blinding puts the study at high risk of bias. Patient blinding is especially important when patient-centered subjective outcomes such as pain scores are collected, as they are more prone to bias. This is the case when different desensitizers are evaluated in NCCLs. In summary, blinding of the patients and the treatment providers may not always be possible; however, blinding of the evaluators and the analysts may.
One of the common failures during reporting of blinding is that authors usually report “this study was single-blind” or “this was a double-blind study,” without reporting who was blinded; this should be clearly stated in the RCTs. In agreement with these findings, Pandis and others25 reported that inadequate description of blinding in RCTs published in leading dental journals ranged from 74% to 100%. In implant dentistry, the lack of adequate blinding reporting was reported to be 58%.26
Reporting of effect size and confidence intervals facilitates interpretation of important clinical differences. Hypothesis testing with p values and statistical significance is based on arbitrary cutoff points (eg, 0.05) and is sensitive to sample size and variance. By increasing sample size, very small and unimportant clinical differences may become statistically significant and may be erroneously interpreted as being “clinically” important.24
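The dependence of the p value on sample size can be illustrated with a simple two-proportion z test. This is a sketch under the normal approximation with a pooled variance; the function name and the retention figures are hypothetical. The same 2-percentage-point difference is nonsignificant with 100 restorations per group and highly significant with 10,000 per group.

```python
import math
from statistics import NormalDist


def two_proportion_p_value(events_a, n_a, events_b, n_b):
    """Two-sided p value of a pooled two-proportion z test."""
    p_a, p_b = events_a / n_a, events_b / n_b
    pooled = (events_a + events_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("test undefined when all or no events occur")
    z = (p_a - p_b) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical retention of 92% vs 90% at two different sample sizes.
p_small = two_proportion_p_value(92, 100, 90, 100)
p_large = two_proportion_p_value(9200, 10000, 9000, 10000)
print(p_small > 0.05, p_large < 0.05)  # prints: True True
```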
In this study, 92.8% of the RCTs did not describe any effect size for at least the primary outcome. This is also a problem in medical journals.27 Authors should report an estimate of the treatment effect, which is a contrast between the outcomes in the comparison groups. For binary outcomes, the effect size could be the risk ratio (relative risk), odds ratio, or risk difference; for survival time data, it could be the hazard ratio or difference in median survival time; and for continuous data, it is usually the difference in means or standardized difference in means. Confidence intervals should be presented as they provide information about data precision.
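For a binary outcome such as restoration loss, two of the effect sizes named above can be computed with their confidence intervals as sketched below. The intervals are Wald-type approximations (on the log scale for the risk ratio), and the function names and counts are hypothetical; they assume independent observations, so they do not apply directly to clustered data.

```python
import math
from statistics import NormalDist


def risk_ratio_ci(events_a, n_a, events_b, n_b, level=0.95):
    """Risk ratio of group A vs group B with a log-scale Wald interval."""
    if events_a == 0 or events_b == 0:
        raise ValueError("risk ratio interval needs events in both groups")
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    rr = (events_a / n_a) / (events_b / n_b)
    se_log = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    return rr, math.exp(math.log(rr) - z * se_log), math.exp(math.log(rr) + z * se_log)


def risk_difference_ci(events_a, n_a, events_b, n_b, level=0.95):
    """Risk difference (A minus B) with a Wald interval."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    risk_a, risk_b = events_a / n_a, events_b / n_b
    rd = risk_a - risk_b
    se = math.sqrt(risk_a * (1 - risk_a) / n_a + risk_b * (1 - risk_b) / n_b)
    return rd, rd - z * se, rd + z * se


# Hypothetical data: 5 of 50 restorations lost with adhesive A, 10 of 50 with B.
rr, rr_low, rr_high = risk_ratio_ci(5, 50, 10, 50)
rd, rd_low, rd_high = risk_difference_ci(5, 50, 10, 50)
print(round(rr, 2), round(rr_low, 2), round(rr_high, 2))
print(round(rd, 2), round(rd_low, 2), round(rd_high, 2))
```

In this hypothetical example the risk ratio interval includes 1 and the risk difference interval includes 0, so the data are compatible with no difference despite the halved point estimate.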
The lack of description of effect sizes suggests that authors still rely on hypothesis testing for group comparisons. Researchers are advised to move away from significance tests to effect size reporting, delimited by confidence intervals. This method incorporates all the information normally included in a hypothesis but in a way that emphasizes the size of the difference (clinical significance rather than statistical significance).27,28
The design and conduct of some RCTs may not be straightforward, particularly when there are losses to follow-up or exclusions. This prevents the description of the numbers of participants through each phase of the study in a few sentences. In complex studies, it may be challenging for readers to discern whether and why some participants did not receive the treatment as allocated or if they were lost to follow-up or were excluded from the analysis.29 This can be simply described by introducing a flow chart with the number of participants in each phase of the trial. Although the CONSORT statement recommends the inclusion of a flow chart, only 13% of the RCTs herein evaluated followed this recommendation.
Another type of bias commonly faced in RCTs is selective outcome reporting. As pointed out in an editorial by de Angelis and others,30 researchers (and journal editors) are generally most enthusiastic about the publication of RCTs that show either a large effect of a new treatment (positive trials) or equivalence of two approaches to treatment (noninferiority trials). Less excitement is observed in RCTs that show that a new treatment is inferior to standard treatment (negative trials), and researchers show even less interest in RCTs that are neither clearly positive nor clearly negative because inconclusive RCTs will not, by themselves, encourage changing practices. Additionally, sponsored RCTs are likely to remain unpublished if the results of the RCTs place financial interests at risk.30
To manage such problems, the International Committee of Medical Journal Editors (ICMJE) proposes comprehensive trial registration. Trials must be registered at or before the onset of patient enrollment.30 For the ICMJE, this policy applies to any clinical trial that started enrollment after July 1, 2005. However, only 4 of the 110 included studies of this review published in 2005 or later were registered (Table 5). Authors are advised to register their trials because registration has two advantages: 1) selective reporting can be avoided and, if present, can be detected by comparing the published version of the paper with the registered protocol; and 2) publication bias is reduced, as studies with negative or inconclusive findings would be available for evaluation. Some dentistry journals, such as Journal of Dentistry and Operative Dentistry, have made trial registration mandatory in their instructions for authors.
Other items of CONSORT such as numbers analyzed, baseline data, losses and exclusions, outcomes, setting, and trial design deserve some discussion. Regarding numbers analyzed, the number of participants per group in all analyses should be clear in the study. Reporting summary statistics or only percentages, relative risks, or odds ratios is not enough as they do not allow readers to assess whether some of the randomly assigned participants were excluded from the analysis. The same applies to losses and exclusions. Along with the description of these figures per group, reasons for the losses and exclusions should be given as they may be related to the intervention. For instance, when a patient moves to another city, it is unlikely to be related to the intervention, but if a patient does not attend the recalls because he or she wants to be withdrawn from the trial, then the reason may be related to side effects or lack of efficacy of the treatments under evaluation.
Baseline information, adequately reported in only 20.3% of the papers, allows readers to check whether groups are comparable at baseline. Although proper random assignment prevents selection bias, it does not guarantee that the groups are equivalent at baseline. Any differences in baseline characteristics are, however, the result of chance rather than bias, which is why there is no need to perform hypothesis testing for these characteristics. For instance, in the case of RCTs in NCCLs, the presence of occlusal wear facets is considered a predictive factor for restoration loss. The number of restorations placed in teeth with or without occlusal wear facets per group is therefore essential for baseline evaluation.31
For all three of these items (numbers analyzed, losses and exclusions, and baseline characteristics), authors should be careful when presenting data. First, displaying percentages instead of raw numbers is risky. Rounded percentages may be compatible with more than one numerator and if authors fail to provide the number analyzed, then the denominator (total number of participants evaluated) will be unclear. For instance, 50% may represent five of 10 but also 500 of 1000. Second, merged data of groups can be provided as long as their individual values are also reported. Third, continuous variables should be presented as means and standard deviations (or standard errors) or medians and interquartile ranges (when not normally distributed); dichotomous variables in number of counts versus total number of observations.
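The ambiguity of a rounded percentage can be made concrete by listing every fraction that is compatible with it. The sketch below uses a hypothetical helper and treats a fraction as compatible when it lies within half a percentage point of the reported value.

```python
def compatible_fractions(reported_percent, max_denominator):
    """List (numerator, denominator) pairs that round to the reported percent."""
    matches = []
    for n in range(1, max_denominator + 1):
        for k in range(0, n + 1):
            if abs(100 * k / n - reported_percent) < 0.5:
                matches.append((k, n))
    return matches


# A reported "50%" with at most 12 participants per group.
pairs = compatible_fractions(50, 12)
print(pairs)
# prints: [(1, 2), (2, 4), (3, 6), (4, 8), (5, 10), (6, 12)]
```

Without the number analyzed, a reader cannot tell which of these fractions the percentage stands for.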
The trial design involves the description of type of the trial (parallel, cross-over, factorial, split-mouth, and/or multiple restorations); the conceptual framework (superiority, noninferiority, or equivalence trial); and the allocation ratio (eg, 1:1 or 1:2). The setting (where and when the study was performed) is also essential to place a study in historical context and to evaluate its external validity (generalization of the findings to other populations).
Risk of Bias
Except for incomplete outcome data and selective reporting, which were not major problems in the articles included in the present study, in all other domains of the CCRT most RCTs were judged to have unclear or high risk of bias. The implications of inadequate sequence generation, allocation concealment, and examiner blinding were already discussed in detail.
We also added another domain in the CCRT for the analysis of the risk of bias, which is the experimental unit. The great majority of the authors placed multiple restorations per patient and considered each tooth as an experimental unit, without taking into consideration the clustered nature of the data. In these cases, authors applied conventional hypothesis testing statistics that assume that data are “independent.” Treating multiple observations from one participant as independent data is a serious error. Having this in mind, authors are advised to 1) place a single restoration per group in each patient in a paired design; 2) place more than one restoration per group in each patient, but only one value (median, mean, worst score, etc) per patient/group should be statistically analyzed; or 3) place multiple restorations per patient but use more advanced statistical models to account for the paired nature of the data.
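The second option above, analyzing only one value per patient and group, amounts to collapsing the restoration-level data before any statistical test. The sketch below is illustrative only: the function name, record layout, and scores are hypothetical, and higher scores are assumed to mean worse outcomes so that `max` returns the worst score.

```python
from collections import defaultdict
from statistics import median


def collapse_per_patient(records, summary=max):
    """Reduce (patient, group, score) records to one value per patient/group."""
    grouped = defaultdict(list)
    for patient_id, group, score in records:
        grouped[(patient_id, group)].append(score)
    return {key: summary(scores) for key, scores in grouped.items()}


# Hypothetical data: several restorations per patient in each group.
records = [
    ("P1", "adhesive_X", 0), ("P1", "adhesive_X", 2), ("P1", "adhesive_Y", 1),
    ("P2", "adhesive_X", 1), ("P2", "adhesive_Y", 0), ("P2", "adhesive_Y", 0),
]
worst = collapse_per_patient(records)              # worst score per patient/group
middle = collapse_per_patient(records, median)     # median per patient/group
print(worst)
```

After this step each patient contributes a single observation per group, so paired tests that assume independence between patients can be applied.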
Overall, only 4.3% of the studies were considered at low risk of bias. Most of the studies (59.4%) were at high risk of bias, and this affects the quality of the body of evidence produced thus far.
Although some journals have adopted the CONSORT guidelines in the instructions for authors, active compliance is yet to be achieved. Perhaps the inclusion of additional subheadings for RCTs, as suggested by Kloukos and others,11 could result in better compliance with the CONSORT statement. The results of the present study indicate that adherence of RCTs that evaluate adhesive systems in NCCLs to the CONSORT statement requires improvements. Adherence to the CONSORT statement will also reduce the high risk of bias of studies in the field.
This study was partially supported by CAPES and National Council for Scientific and Technological Development (CNPq) under Grants 304104/2013-9 and 305588/2014-1.
Conflict of Interest Declaration
The authors of this manuscript certify that they have no proprietary, financial, or other personal interest of any nature or kind in any product, service, and/or company that is presented in this article.