Artificial intelligence based writer identification generates new evidence for the unknown scribes of the Dead Sea Scrolls exemplified by the Great Isaiah Scroll (1QIsaa)
Mladen Popović1\Yinyang*, Maruf A. Dhali 2\Yinyang, Lambert Schomaker2\Yinyang
1 Qumran Institute, Faculty of Theology and Religious Studies, University of Groningen, Groningen, The Netherlands
2 Department of Artificial Intelligence, Faculty of Science and Engineering, University of Groningen, Groningen, The Netherlands
These authors contributed equally to this work.
* Corresponding author’s address: Qumran Institute, Oude Boteringestraat 38, 9712 GK Groningen, The Netherlands. Email: firstname.lastname@example.org
The Dead Sea Scrolls are tangible evidence of the Bible’s ancient scribal culture. Palaeography—the study of ancient handwriting—can provide access to this scribal culture. However, one of the problems of traditional palaeography is to determine writer identity when the writing style is near uniform. This is exemplified by the Great Isaiah Scroll (1QIsaa). To this end, we used pattern recognition and artificial intelligence techniques to innovate the palaeography of the scrolls regarding writer identification and to pioneer the microlevel of individual scribes to open access to the Bible’s ancient scribal culture. Although many scholars believe that 1QIsaa was written by one scribe, we report new evidence for a breaking point in the series of columns in this scroll. Without prior assumption of writer identity, based on point clouds of the reduced-dimensionality feature-space, we found that columns from the first and second halves of the manuscript ended up in two distinct zones of such scatter plots, notably for a range of digital palaeography tools, each addressing very different featural aspects of the script samples. In a secondary, independent, analysis, now assuming writer difference and using yet another independent feature method and several different types of statistical testing, a switching point was found in the column series. A clear phase transition is apparent around column 27. Given the statistically significant differences between the two halves, a tertiary, post-hoc analysis was performed by visual inspection of character heatmaps and of the most discriminative fraglet sets in the script. Demonstrating that two main scribes were responsible for the Great Isaiah Scroll, this study sheds new light on the Bible’s ancient scribal culture by providing new, tangible evidence that ancient biblical texts were not copied by a single scribe only but that multiple scribes could closely collaborate on one particular manuscript.
Ever since their modern discovery, the Dead Sea Scrolls have been famous for containing the oldest manuscripts of the Hebrew Bible (Old Testament) and many hitherto unknown ancient Jewish texts. The manuscripts date from the 4th century BCE to the 2nd century CE. They come from the caves near Qumran and other Judaean Desert sites west of the Dead Sea, except for Wadi Daliyeh, which is north of Jericho. Among other things, the scrolls provide a unique vantage point for studying the latest literary evolutionary phases of what was to become the Hebrew Bible. As archaeological artifacts, they offer tangible evidence for the Bible’s ancient scribal culture ‘in action’.
A crucial yet hardly used entry point into the Bible’s ancient scribal culture is that of individual scribes. There is, however, a twofold problem with putting this entry point of individual scribes into effective use. Except for a handful of named scribes in a few documentary texts [3, 4], the scribes behind the scrolls are anonymous. This is especially true for the scrolls from Qumran, which, with almost a thousand manuscripts of mostly literary texts, represents the largest find site.
The next best thing to scribes identified by name is scribes identified by their handwriting. Although some of the suggestions for a change of scribal hands in a single manuscript or scribes who copied more than one manuscript [5, 6, 7, 8] have met with broader assent, most have not been assessed at all. And estimations of the total number of scribes [9, 10, 11], an argument in the discussion about the origin of the scrolls from Qumran [4, 12, 13, 14], have been, at best, educated guesses.
One of the main problems regarding traditional palaeography of the Dead Sea Scrolls, and also for writer identification in general [15, 16], is the ability to distinguish between variability within the writing of one writer and similarity in style—but with subtle variations—between different writers. On the one hand, scribes may show a range in a variety of forms of individual letters in one or more manuscripts. On the other hand, different scribes may write in almost precisely the same way, making it a challenge to determine the individual scribe beyond similarities in the general style.
The question is whether perceived differences in handwriting are significant and the result of there being two different writers or insignificant because they are the result of normal variations within the handwriting of the same writer. The problem with knowing which differences are likely to be idiographic, and thus significant, is that, in the end, this also involves using implicit criteria that are experience-based [15, 17]. In this regard, although they work according to differing methodologies [15, 17, 18], there is no difference between professional forensic document examiners and palaeographers. The problem is also how one can convince others [19, 20], whether through pictorial form, verbal descriptions, palaeographic charts or a combination thereof.
The Great Isaiah Scroll from Qumran Cave 1 (1QIsaa) exemplifies the lack of a robust method in Dead Sea Scrolls palaeography for determining and verifying writer identity or difference, especially when the handwriting is near uniform. The question for 1QIsaa is whether subtle differences in writing should be regarded as normal variations in the handwriting of one scribe or as similar scripts of two different scribes and, if the latter, whether the writing of the two scribes coincides with the two halves of the manuscript. The scroll measures 7.34 m in length, averages 26 cm in height, and contains 54 columns of Hebrew text. There is a codicological caesura between columns 27 and 28 in the form of a three-line lacuna at the bottom of column 27. In the second half of the scroll, the orthography and morphology of the Hebrew are different and there are spaces left blank.
Scholars have perceived an almost uniform writing style throughout the manuscript of 1QIsaa [21, 22], yet also acknowledged that different scribes could have shared a similar writing style [23, 24]—the script type is called Hasmonaean in the field, the style of writing is formal, and the manuscript is traditionally dated to the late 2nd century BCE. Accordingly, some have argued that two scribes were each responsible for copying half of the manuscript, columns 1–27 and columns 28–54 [5, 25]. But most scholars have argued or assumed that the entire manuscript was copied by one scribe, with minor interventions by other, contemporaneous and also much later, scribes [26, 27, 28], and that differences between the two halves should be explained otherwise, for example, by assuming that two separate and dissimilar Vorlagen were used or that the Vorlage for the second half was a damaged manuscript [29, 30, 31, 32, 33, 34, 35, 36, 37].
No one, however, has provided detailed palaeographic arguments for writer identity or difference in 1QIsaa, except for one study, which provided a palaeographic chart to argue for one main scribe. But that palaeographic chart is insufficient to demonstrate this for at least three reasons (additional details about the supposed scribal idiosyncrasies are provided in the supplementary material A). First, having been electronically produced, it is unclear from where, and how exactly, the characters were taken. Second, it is unclear whether “the typical form of the letters” is deemed typical because it is the most common form or because it is idiographic, understood as a subtle variation in graphic form that gives evidence of individuality. Finally, the crucial question is how large amounts of data were processed to generate the chart. The number of instances of a specific Hebrew letter may run into the thousands in 1QIsaa.
Here, pattern recognition and artificial intelligence techniques can assist researchers by processing large amounts of data and by producing quantitative analyses that are impossible for a human to perform. Over the years, within the field of pattern recognition, dedicated feature extraction techniques have been proposed and studied for identifying writers. Applied to handwritten documents, these techniques extract useful, writer-specific quantitative data and produce feature vectors. In one of our earlier studies, we tested both textural-based and grapheme-based features on a limited number of scrolls to identify scribes. Textural-based features use the statistical information of slant and curvature of the handwritten characters. Grapheme-based features extract local structures of characters and then map them into a common space, similar to the so-called bag-of-words approach in text analysis.
We have already shown that Hinge, a textural feature operating on the microlevel of handwriting, can be useful in identifying writers. In the process of producing character shapes, writers subconsciously slow down and speed up their hand movements. For example, a bend within a character indicates where a slowing down took place, and the sharper the bend, the greater the deceleration of the hand movement. Hinge uses this link between static shape and the dynamics of writing to produce a feature vector.
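The core idea can be illustrated with a minimal sketch (a hypothetical simplification, not the published Hinge implementation): for each point on an extracted ink contour, the angles of two short ‘legs’ along the contour are quantized and accumulated into a joint histogram, so sharp bends and characteristic slants leave a writer-specific signature.

```python
import math
from collections import Counter

def hinge_histogram(contour, leg=3, bins=12):
    """Sketch of a Hinge-style feature: for each contour point, take the
    angles of two 'legs' extending along the contour and build a joint,
    normalized histogram of the quantized angle pairs."""
    counts = Counter()
    n = len(contour)
    for i in range(leg, n - leg):
        x0, y0 = contour[i]
        x1, y1 = contour[i - leg]
        x2, y2 = contour[i + leg]
        a1 = math.atan2(y1 - y0, x1 - x0)   # angle of the backward leg
        a2 = math.atan2(y2 - y0, x2 - x0)   # angle of the forward leg
        b1 = int((a1 + math.pi) / (2 * math.pi) * bins) % bins
        b2 = int((a2 + math.pi) / (2 * math.pi) * bins) % bins
        counts[(b1, b2)] += 1
    total = sum(counts.values()) or 1
    return {k: v / total for k, v in counts.items()}  # entries sum to one
```

A sharp bend shows up as an angle pair far from the straight-line pairs, so the histogram captures curvature statistics along the whole ink trace.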
Similar to the textural features, allographs (prototypical character shapes) can also be useful for writer identification. Allographs can be obtained either from full characters or from parts of characters. We have already worked with full characters and used them to create a codebook of Dead Sea Scrolls characters for style development analysis.
Such quantitative evidence is additional evidence that can stimulate palaeographers to explicate their qualitative analyses [20, 43]. Pattern recognition and artificial intelligence techniques do not give certainty of identification, but they give statistical probabilities that can help the human expert understand, and decide between, the likelihood of different possibilities.
The evidence from pattern recognition methods can be presented in numbers (quantification of distance; the choice of distance measure plays an important role) but also, more helpfully, in two- or three-dimensional visualizations. Also, so-called Kohonen self-organizing feature maps (see Fig.1) and heatmaps may prove important for detecting a typical style of a letter (a centroid), that is, the computed average of all particular instances that were most similar to it. Although such a centroid is statistically a reliable attractor for shapes that look like it, its visual pattern may not correspond to a particular canonical or idealized form. Inspection of the individual instances belonging to a centroid (i.e., its members) will reveal the characteristics of that cluster of shapes. Such analyses may supplement exhaustive letter-by-letter analysis.
Our research demonstrates that two main scribes can be identified in 1QIsaa and that their work coincides with columns 1–27 and columns 28–54. This study illustrates the advantage of the innovative use of robust pattern recognition and artificial intelligence techniques for writer identification in the Dead Sea Scrolls when dealing with an almost uniform writing style that makes it difficult, if not near impossible, for researchers to assess writer identity or difference. Moreover, we show that procedures for cross-examination [17, 20] and falsification are in place, by statistical and post-hoc visual analyses. Bridging artificial intelligence and traditional palaeography, our post-hoc visual analyses go beyond the state of the art by correlating the quantitative analyses to a level suitable for researchers to be able to see what the computer ‘sees’, enabling a new way of looking at palaeographic evidence. Also, our analysis is fully automatic. We have no need for a semi-automatic first step of character reconstruction, as in [44, 45, 46], which aims to imitate the ancient reed pen’s movement, although it seems more likely that the stiff-flexible fibrous tip of the sea-rush stem was used, as in Egypt. We have developed robust and sufficiently delicate binarization and extraction methods and have succeeded in extracting the ancient ink traces as they appear on digital images. This is important because the ancient ink traces relate directly to a person’s muscle movements and are person specific. For writer identification, one should ideally work with the original written content only. The pattern recognition and artificial intelligence techniques should therefore be capable of focusing on the original written content only and should not depend on modern character reconstructions.
In a way that was not possible before, our approach opens access to the tangible evidence of the hitherto almost completely denied microlevel of the individual scribes of the Dead Sea Scrolls and the possibility to examine the different compositions copied by each of the scribes. The change of scribal hands in a literary manuscript or the identification of one and the same scribe in multiple manuscripts can be used as evidence to understand various forms of scribal collaboration that otherwise remain unknown to us. The number of literary manuscripts on which a scribe worked, either alone or with others, can serve as tangible evidence for understanding processes of textual and literary creation, circulation, and consumption. Together with other features such as content and genre, language and script, such clusters of literary manuscripts can contribute to scribal profiles of the anonymous scribes of the Dead Sea Scrolls, which, in turn, can shed new light on ancient Jewish scribal culture, in Hebrew and Aramaic, in the Graeco-Roman period. Here, we first tackle the palaeographic identification of these unknown scribes.
2 Materials and methods
In this section, we provide descriptions of:
the dataset and the image preprocessing techniques (2.1),
the primary analysis for textural features using pattern recognition techniques, for allographic features using artificial neural networks and a combination thereof (2.2),
the second-level analysis using a different shape feature and performing statistical evaluation of the quality of the primary analysis (2.3), and
the third-level post-hoc visual analysis (2.4).
Additional details and descriptions can be found in the supplementary materials.
2.1 Dataset and image preparation
In this study, we have used digital images of 1QIsaa kindly provided to us by Brill Publishers. The images in the Brill scrolls collection vary considerably in resolution. For 1QIsaa, we have images for columns 1–54 except for columns 16 and 46 (instead, columns 15 and 47 appear twice in the Brill collection). The list of scan numbers and their corresponding column numbers is given in the supplementary material B. For the second-level analysis, we have also used the most recently digitized multi-spectral images of the Dead Sea Scrolls, kindly provided to us by the Israel Antiquities Authority (IAA); these images are also accessible on their Leon Levy Dead Sea Scrolls Digital Library website. Although the IAA images do not contain any newly digitized version of 1QIsaa, we have used this vast collection to extract dominant character shapes and produce self-organizing feature maps (see section 2.3).
The images of 1QIsaa pass through multiple preprocessing steps to become suitable for pattern recognition-based techniques. Our first preprocessing step is image binarization. In order to prevent any classification of the text-column images on the basis of irrelevant background patterns, a thorough binarization technique (BiNet) was applied, keeping the original ink traces intact. After binarization, the images were cleaned further by removing the adjacent columns that partially appear in the target columns’ images. Finally, a few minor affine transformations and stretching corrections were performed in a restrictive manner. These corrections also serve to align the text where text lines have become twisted due to the degradation of the leather writing surface (see Fig.2). A more detailed explanation of image preparation can be found in the supplementary material C.1.
2.2 Primary analyses: Feature-space explorations
In order to represent the handwriting of 1QIsaa, we applied feature extraction methods to the binarized, cleaned images to translate the handwriting style into feature vectors. The data relates directly to the tangible evidence of the ink traces in the scrolls, ink penned by scribes. As writing is a moving process that involves muscle movements of the hand and arm, it is governed by the rules of physics and can therefore be quantified.
Our feature extraction methods correlate the ink traces with the hands of the scribes on multiple levels. The allograph level of the whole character shape is easier to communicate to an audience, whereas the micro-level of textural features, such as Hinge, stands further away from the traditional visualization in the form of a palaeographic chart showing the whole character shape. Nonetheless, all these levels are equally directly related to the writing activity of ancient scribal hands that penned the ink on the scrolls.
The question regarding 1QIsaa, namely whether the manuscript was written by one scribe or by different scribes, was communicated to the researcher performing the primary analysis, but no further information about the state of the art regarding this question in scrolls studies (see section 1) was communicated.
The primary analysis involved three steps.
Step 1. Feature extraction, performed in three ways:
Textural feature extraction using pattern recognition techniques
Allographic feature extraction using artificial neural networks
Adjoined feature extraction (a weighted combination of both textural and allographic features)
Step 2. After extracting features from each of the column images, we measured the distance between the feature files using the chi-square distance. The chi-square distance is the distance between two histograms, $x$ and $y$, both having $n$ bins. In our case, the histograms are the feature vectors. During the calculation, we normalize the histograms, i.e. their entries sum up to one. The name of the distance is derived from Pearson’s chi-square test statistic, and the distance is defined as:

$$\chi^2(x, y) = \sum_{i=1}^{n} \frac{(x_i - y_i)^2}{x_i + y_i}$$
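In code, this distance can be sketched in a few lines (a minimal illustration; in the study it is applied to the extracted feature vectors):

```python
def chi_square_distance(x, y):
    """Chi-square distance between two histograms of equal length.
    Both histograms are first normalized so their entries sum to one,
    as described in the text; empty bins in both are skipped."""
    sx, sy = sum(x), sum(y)
    x = [v / sx for v in x]
    y = [v / sy for v in y]
    return sum((a - b) ** 2 / (a + b) for a, b in zip(x, y) if a + b > 0)
```

Identical histograms give distance 0; fully disjoint histograms give the maximal value of 2 after normalization.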
These distance files contain raw numbers that are relatively difficult to analyze without any reference distance. To solve this issue, we first move to clustering techniques and then to probability curves. While clustering, we reduce the feature space to a three-dimensional space to facilitate the visualization of the feature vectors.
Step 3. A feature extraction method such as Hinge provides us with a large feature vector, containing hundreds of variables. Some features in the feature vector might not have a large influence on the result. Therefore, the dimensionality of the data can be reduced in such a way that the most important aspects of the data remain. One way to do this is Principal Component Analysis (PCA). It transforms the data into n mutually uncorrelated components. Using PCA, we go from the high-dimensional space to a three-dimensional space and then inspect this three-dimensional plot to see whether there is any significant movement of the point cloud.
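The projection can be sketched with a standard SVD-based PCA (an illustrative implementation, assuming one feature vector per column image; numpy is used for the linear algebra):

```python
import numpy as np

def pca_3d(features):
    """Project high-dimensional feature vectors (one row per column image)
    onto their first three principal components."""
    X = np.asarray(features, dtype=float)
    X = X - X.mean(axis=0)                  # center the data
    # SVD of the centered data: rows of Vt are the principal axes
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt[:3].T                     # scores on the first 3 components
```

The resulting three columns are what gets plotted as the point cloud in the scatter plots described below.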
In order to facilitate the decision-making process directly from the distance files (from step 2), one typical approach is to analyse probability curves: a False Acceptance Rate (FAR) curve (the likelihood that the system will incorrectly accept a writer) and a False Rejection Rate (FRR) curve (the likelihood that the system will incorrectly reject a writer). These curves are generated from a known set of writers to incorporate all the variabilities. Depending on the distance between two feature vectors, the probability of the samples being from the same or a different writer can then be determined. Unfortunately, for the Dead Sea Scrolls collection, there is no certain identification of known writers. In this study, we have avoided introducing into our algorithm any assumptions by palaeographers about scribal identity or difference in the scrolls in general or in 1QIsaa specifically. This procedure ensures that the outcome of this study is independent of any such bias.
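Had a labeled set of known writers been available, FAR and FRR at a given decision threshold could be computed as follows (a hypothetical sketch; the two distance lists would come from same-writer and different-writer pairs of a reference set):

```python
def far_frr(same_writer_dists, diff_writer_dists, threshold):
    """At a given distance threshold, FAR is the fraction of different-writer
    pairs wrongly accepted (distance <= threshold) and FRR is the fraction of
    same-writer pairs wrongly rejected (distance > threshold)."""
    far = sum(d <= threshold for d in diff_writer_dists) / len(diff_writer_dists)
    frr = sum(d > threshold for d in same_writer_dists) / len(same_writer_dists)
    return far, frr
```

Sweeping the threshold over the observed distances yields the two curves; their crossing point is the usual equal-error-rate operating point.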
Instead of being able to use probability curves, robust alternative techniques are needed for the Dead Sea Scrolls. In order to cross-check and test the quality of our findings from the primary analysis, we have used statistical evaluation as second-level analysis.
2.3 Secondary analyses: Statistical evaluations
The goal of the second-level analyses is to independently assess whether there is a transition of style in the sequence of columns. The suspicion that there is a transition in the series of columns was communicated to the researcher performing this cross-check. However, until step 5, no more specific information was given about the point in the sequence of columns where a style transition was observed in the primary analysis. The logistic tests performed in this part of the study were not influenced by any column information. This procedure ensures the independence of the second-level cross-examination.
The second-level analysis involved five steps; more detailed descriptions can be found in the supplementary materials D.
Step 1. In order to use a shape feature that is very different from those used in the primary phase of the study, it was proposed to use a fraglet approach, the so-called fragmented connected-component contours (fco3) [51, 40, 52]. In comparison to textural features, which concentrate on micro-details along the ink trace, fraglets contain more allographic information that may be understandable to a palaeographer.
Step 2. A large Kohonen self-organizing feature map (SOFM) was computed, containing centroids for such fraglets from the total IAA multi-spectral image collection at our disposal, yielding prototypical fraglets. Randomly selected fraglets were used for this stage, and each centroid is based on a large number of fraglet instances. The use of the Kohonen map is not essential; other clustering methods could be used, and this step is not critical. The Kohonen map, however, has the advantage that the centroids that end up in the map change gradually along it, as opposed to the haphazard ordering of centroids in, e.g., a k-means algorithm.
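The gradual-ordering property can be illustrated with a minimal one-dimensional Kohonen map (an illustrative sketch with arbitrary settings, not the large map used in the study):

```python
import random

def train_sofm_1d(data, n_units=10, epochs=20, lr=0.3, radius=2, seed=0):
    """Minimal 1-D Kohonen self-organizing map: each unit holds a centroid
    vector; the best-matching unit and its map neighbours are pulled toward
    each training sample, so neighbouring centroids change gradually along
    the map (the property noted above)."""
    rng = random.Random(seed)
    dim = len(data[0])
    units = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    for _ in range(epochs):
        for x in data:
            # best-matching unit by squared Euclidean distance
            bmu = min(range(n_units),
                      key=lambda u: sum((units[u][k] - x[k]) ** 2
                                        for k in range(dim)))
            # update the BMU and its neighbourhood on the 1-D map
            for u in range(max(0, bmu - radius), min(n_units, bmu + radius + 1)):
                for k in range(dim):
                    units[u][k] += lr * (x[k] - units[u][k])
    return units
```

In the study, each resulting centroid serves as a prototypical fraglet against which the fraglets of a column image are histogrammed.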
Step 3. For the series of columns, a fraglet histogram was computed for the split-scan samples a and b of each column, separately. In digital palaeography and forensic handwriting analysis, this approach is used to check that the algorithm responds reasonably. It is expected that versions a and b of a column of text should be close neighbours, under the assumption that a column was produced by a single scribe. If a hit list of neighbours for a query column does not return the corresponding sibling version near the top of the hit list of a search operation, results should be judged critically. Conversely, if the corresponding sample appears at the top, the neighbouring hits will also have a larger probability of having been produced by the same scribe.
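This sanity check can be sketched as follows (an illustrative implementation, assuming the half-column feature vectors are stored in sibling pairs, with the two halves of column i at indices 2i and 2i+1):

```python
def split_half_hit_rate(samples):
    """Fraction of half-column samples whose nearest neighbour (Euclidean,
    excluding itself) is the other half of the same column."""
    def d2(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    hits = 0
    for i, s in enumerate(samples):
        nn = min((j for j in range(len(samples)) if j != i),
                 key=lambda j: d2(s, samples[j]))
        hits += (nn == (i ^ 1))  # sibling index flips the last bit
    return hits / len(samples)
```

A hit rate well below 1.0 would signal that the feature or the distance measure is not responding reasonably, and the downstream results should be judged critically.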
Step 4. For each sample, the nearest neighbours were computed in the rest of the list. Bookkeeping was performed on the distance in feature space and the column number of the hits that were found.
Step 5. From the computed data (from steps 1-4), i.e., the distances and the column numbers of the nearest neighbour samples, four follow-up steps can be taken that help to determine whether the handwriting style is uniform throughout the manuscript of 1QIsaa or whether there are style differences.
5a. For testing the deviation from a random voting pattern for left-vs-right neighbours of a given column in fraglet-shape space, a chi-square test was used. If there is a single signal source (scribe), nearest neighbours will fall to the left or right of a column in the series in a random pattern.
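For a single column, the deviation from the 50/50 left/right split expected under a single-scribe hypothesis can be sketched with a one-degree-of-freedom chi-square statistic (an illustrative computation; values above 3.84 would be significant at p < 0.05):

```python
def left_right_chi2(left_hits, right_hits):
    """Chi-square statistic (1 d.o.f.) testing whether nearest-neighbour
    'votes' falling left vs right of a column deviate from the 50/50
    split expected if a single scribe produced the whole series."""
    n = left_hits + right_hits
    expected = n / 2
    return ((left_hits - expected) ** 2 / expected
            + (right_hits - expected) ** 2 / expected)
```

Near the scribe boundary, the votes become strongly one-sided and the statistic spikes, which is what produces the dip in the probability curve reported in the results.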
5b. A one-way analysis of variance (for two groups, equivalent to a t-test) was performed on the distance values of the left versus right nearest-neighbour matches in the series of columns.
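The comparison of the two groups of distance values can be sketched with a t statistic (illustrative; Welch's unequal-variance form is used here, and the degrees-of-freedom and p-value lookup are omitted):

```python
import math

def welch_t(a, b):
    """Welch's t statistic for two independent samples: difference of means
    scaled by the pooled standard error of the two groups."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)   # sample variances
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    return (ma - mb) / math.sqrt(va / na + vb / nb)
```

A large |t| indicates that the left and right nearest-neighbour distances differ by more than sampling noise would explain.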
5c. Apart from the distance between columns in fraglet shape space, it is interesting to check the estimated position of the best-matching neighbour column for any given column in the series. If there is a single scribe, the nearest neighbour could appear anywhere in the scroll. Conversely, if there are two scribes, the columns in the left half would tend to have their best-matching neighbours on the left, and vice versa.
5d. If there is a phase transition in the sequence of columns, fitting a logistic curve on the variable ‘average neighbour position’ over columns should reveal the switching point reliably, i.e., with a high Pearson correlation of the fit. The number of the critical phase-transition column is the output of this test.
2.4 Tertiary analyses: Post-hoc visual analyses
The aim of the post-hoc visual analyses was to attempt to correlate the quantitative analyses from pattern recognition and artificial intelligence techniques with a qualitative analysis from a traditional palaeographic approach.
The third-level analysis involved three steps.
Step 1. For visual inspection by palaeographers, we created charts with full character shapes for individual Hebrew letters that can be found in the supplementary material E.
Step 2. In order to facilitate the complex process of visual inspection, we generated heatmaps for each character shape. The heatmaps are aggregated visualizations of the shape of each letter: they are made up of all particular instances of a letter and as such do not exist in one particular form. Thus, the use of heatmaps fulfils, through a sophisticated and robust procedure, the requirement from forensics to study each particular instance of a character. Visualization by heatmaps may also be an important step forward: heatmaps could work better than the palaeographic charts traditionally used in the field because they are not limited to one or a few particular examples, whose indicative value can be doubtful, but are made up of all instances of a letter.
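The aggregation itself is simple (an illustrative sketch, assuming the letter instances have already been segmented, aligned, and binarized to equal-sized 0/1 grids):

```python
def letter_heatmap(instances):
    """Aggregate many aligned binary images (0/1 nested lists) of the same
    letter into a heatmap: each cell holds the fraction of instances with
    ink at that position."""
    n = len(instances)
    rows, cols = len(instances[0]), len(instances[0][0])
    return [[sum(img[r][c] for img in instances) / n for c in range(cols)]
            for r in range(rows)]
```

Cells near 1.0 mark strokes shared by virtually all instances, while intermediate values expose the regions where the letter shape varies between instances, which is exactly what the visual inspection targets.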
Step 3. Should the primary analyses’ results and the statistical tests in the second stage turn out to be significant, a post-hoc visual analysis of the fraglet set contributing best to the discrimination between the left and right parts of the sequence is required to bridge the quantitative and the qualitative approaches. The fraglets refer to parts of characters that can be more precise, distinctive, and informative in finding significant shape differences than the full characters. For each of the fraglet shapes, an exploration can be performed to identify its significance in separating the two halves (if a separation exists). Then, by running all possible combinations of fraglets and counting their presence in each image, a statistical view of the two halves can be obtained.
3 Results
3.1 Primary analyses
Here we present the plots that result from the three types of feature extraction techniques that we used and from the distance measurements between the feature files using the chi-square distance. The plots have been examined to find any possible clustering or any significant movement of the point cloud across the columns of 1QIsaa. We used the PCA technique on each of the feature collections and plotted them in a three-dimensional visual space (see Fig.3). Figure 3 shows a red point for each of the columns of 1QIsaa.
In the next step, we used the colour red for columns 1–27 and the colour green for columns 28–54. Please note that this colouring works only as a label and has no effect on the experiments. The plots were then generated again for all three types of features (see Fig.4, Fig.6, and Fig.7).
Figure 4 shows the plot using Hinge feature vectors on the full column images of 1QIsaa. There is a separation between the two sets of columns. Except for an outlier (column 29), the red and green points can be separated using a two-dimensional plane (similar to a piece of paper). This is visualized in Figure 5. The implication is that there might be a clear separation of the two sets of data, yet they are also close to each other.
As for column 29 appearing as an outlier in this part of the primary analysis, in the independent second-level analyses (see section 3.2) column 29 does not show up as a clear outlier. Moreover, even in the primary analysis, column 29 is not an extreme outlier; rather, it lies close to the separation line between the two halves of the manuscript. Further tests can be performed in the future to establish a concrete reason for this.
Figure 6 shows the plot for the Fraglet feature from a Kohonen SOFM. Here, the points are not as clearly separated as in the case of the Hinge feature. The reason might be that, because the Fraglet feature renders the physical shapes of characters, similar to what the human eye sees, it is less adequate (in this particular case) for detecting micro-level differences in the data.
In the last step, we combined both these features, Hinge and Fraglet. Figure 7 shows the plot for the combined feature, or Adjoined feature. A clear separation is visible here between the data points in the adjoined feature plot.
Thus, the primary analysis indicates a significant difference between the two halves of the columns of 1QIsaa, with a visibly clear separation in the point cloud of features.
3.2 Secondary analyses
Steps 1–4 (described in section 2.3) are prerequisites for performing the tests in step 5. A detailed description of the first four steps can be found in the supplementary material D.1. It is important to note here that the fraglet features (fco3) used in the secondary analyses are derived from a different Kohonen SOFM than the one used in the primary analyses (supplementary materials C.1.2 and D.1). This is done to ensure the independence of the two analyses and to perform cross-validation. The results of the statistical tests conducted in step 5 of the second-level analyses are as follows.
Step 5a. Figure 8 shows the pattern of statistical probability that the left/right voting pattern deviates from random. A clear dip is present at the middle of the graph, confirming that at that point the distribution of a column’s nearest neighbours to the left or right is very unlikely to be accidental. This analysis, however, should be considered exploratory rather than a rigorous test, due to multiple testing over several time windows. Therefore, additional testing was done on the basis of the pattern of distances of columns to their nearest neighbours in shape space.
Step 5b. The average distance from a query column to its best match differs between the left and the right series, and the difference is statistically significant. The inter-column distances are somewhat higher in the left series as compared to the right part, but their regularity is higher, given the lower standard deviation.
Step 5c. Figure 9 shows the obtained column position of the best-fitting neighbour for each column. Visually, from the smoothed curves, it can be seen that left of column 27 the average position of the hits lies between columns 20 and 25, while right of column 27 the average position of hits lies between columns 30 and 35. A t-test indicates that the average nearest-neighbour column number for a column on the left is at column 24, whereas for a column on the right it is position 32; the difference is statistically significant (see supplementary material D.2). Therefore, the between-column similarity is highest ‘ipsilateral’ with respect to the cut point (column 27): ‘left’ looks like left, ‘right’ looks like right.
Step 5d. The results from steps 5a–5c are visually and statistically clear, but a valid question is whether the actual transition point, i.e., the column number of the phase transition, can also be computed. The logistic function, or Fermi-Dirac function, is commonly used to model phase transitions in physics and biology. In the humanities, it can be used to model language change [54, 55]. Two types of analysis were performed to estimate the parameters of the logistic model:
ŷ = a + b / (1 + e^(−c(x − d))),

where x is the column number and ŷ is the estimated average position of its nearest neighbour in the column series, as measured in the fraglet shape space. Parameter a represents the vertical offset, b the scale factor, c the steepness of the phase transition, and d the column number where the phase transition occurs. In order to be very sure that a solution for the transition point is not haphazard, we performed two very different estimation procedures for the logistic function. First, in order to allow a list of high-quality model fits to evolve without constraints, we used a Monte-Carlo estimation, randomly varying parameter values and retaining the best solutions. This sampling approach allows good results to emerge without theoretical assumptions. The second method is the more traditional curve-fitting approach that uses the ‘least-squares error’ as the assumed constraint, delivering a single best-effort solution. Without seeding the logistic-function estimator with knowledge concerning the suspected column number 27, the output of the Monte-Carlo estimate can be found in Table 1:
In Table 1, the value of the transition parameter means that the transition column is estimated to occur between columns 27 and 28, with a smooth transition lasting from column 24 to 32 (Fig 9). The fit of the sigmoid transition model is significant, with a correlation r of 0.74 () on the raw data. An exact fit would have yielded r = 1.0. Although not perfect, r = 0.74 would be considered a very robust correlation in empirical disciplines such as psychology and biology. The model would explain 55% of the variance in the data, which is unsurprising, given that the logistic model is a stylized description of a time sequence with irregularities. If we smooth the irregularities over time, using a running average over 3 or 5 samples (columns) to smooth out the within-writer variation, the correlation with the sigmoid increases considerably: if we smooth the column time series over 3 values, () (76% of the variance explained by the sigmoid phase transition); if we smooth the column time series over 5 values, r = 0.93 (86% of the variance explained by the sigmoid phase transition).
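The Monte-Carlo estimation of the logistic parameters can be sketched as follows, on a synthetic nearest-neighbour position series with an artificial step near column 27 (the series, the parameter ranges, and the sample count are illustrative assumptions, not the paper’s actual settings):

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic nearest-neighbour position series with a step near column 27
# (an illustrative stand-in for the measured fraglet-space series).
cols = np.arange(1, 55)
y = np.where(cols <= 27, 24.0, 32.0) + rng.normal(0, 2.0, cols.size)

def logistic(x, a, b, c, d):
    """Four-parameter logistic: a = offset, b = scale,
    c = steepness, d = transition column."""
    return a + b / (1.0 + np.exp(-c * (x - d)))

# Monte-Carlo estimation: draw random parameter vectors, keep the best.
best_err, best_params = np.inf, None
for _ in range(20000):
    params = (rng.uniform(15, 30),   # a: vertical offset
              rng.uniform(1, 20),    # b: scale factor
              rng.uniform(0.05, 5),  # c: steepness
              rng.uniform(5, 50))    # d: transition column
    err = np.mean((logistic(cols, *params) - y) ** 2)
    if err < best_err:
        best_err, best_params = err, params

a, b, c, d = best_params
r = np.corrcoef(logistic(cols, a, b, c, d), y)[0, 1]
```

No knowledge of the suspected column 27 enters the search: the transition column d is drawn uniformly over almost the whole series.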
As a double check, the Monte-Carlo-based fit was replicated with a more traditional least-squares curve fit (Python scipy package), yielding a phase transition at column 26.6 for the raw data (), at column 26.2 for a time series smoothed with a window of three points (), and at column 26 for a window of five points (). This double check indicates that both traditional curve fitting and stochastic model fitting yield a transition around the middle columns of 1QIsaa. Interestingly, the quality of fit (correlation) is also similar for these two very different estimation methods, adding to the confidence in the identified transition point.
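The least-squares counterpart can be sketched with scipy’s `curve_fit`, again on synthetic data (the seed values, noise level, and smoothing window are assumptions for illustration):

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(1)

# The same kind of synthetic step-like series as in the Monte-Carlo sketch.
cols = np.arange(1, 55, dtype=float)
y = np.where(cols <= 27, 24.0, 32.0) + rng.normal(0, 2.0, cols.size)

def logistic(x, a, b, c, d):
    return a + b / (1.0 + np.exp(-c * (x - d)))

# Least-squares fit; p0 seeds the optimizer with neutral values
# (series minimum, range, unit steepness, series midpoint).
p0 = [y.min(), np.ptp(y), 1.0, cols.mean()]
params, _ = curve_fit(logistic, cols, y, p0=p0, maxfev=10000)
a, b, c, d = params

# Smoothing the series (running average over 3 columns) typically
# raises the correlation with the fitted sigmoid.
smoothed = np.convolve(y, np.ones(3) / 3, mode="valid")
```

Agreement between this deterministic fit and the stochastic search above is what adds confidence to the estimated transition column.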
Thus, the second-level analyses confirm the presence of two different clusters in writing style in a series of handwritten columns, to be called left and right. The confirmation occurs in three different ways:
left/right votes for the relative serial position of the nearest neighbour of a column,
distance to nearest neighbours on the left or right, and
average serial column position of the nearest neighbour of a given column in fraglet-feature space.
The results from these analyses show that a transition point occurs at around column 27, although the obtained logistic model fit in step 5d suggests that the transition may not be completely sharp.
3.3 Tertiary analyses
Below are the results of our attempt to connect, through visualization, the quantitative pattern recognition and artificial intelligence analyses to a level at which palaeographers can see what the quantitative analyses ‘see’, in this case a clear separation in style.
Step 1. Our charts with full character shapes for individual Hebrew letters improve significantly on the traditional palaeographic chart, such as in . Each instance of a character can be traced directly back to its exact position in the manuscript of 1QIsaa. Also, no modern human hand is involved, either in retracing the characters or in character reconstruction. The ink traces are extracted as is from the digital images and retain the movements once made by the ancient scribe’s hand (see supplementary material E).
However, as described in Section 1, due to the large number of characters from each column and the number of columns, the decision-making process from visual inspection alone of such charts may prove inadequate.
Step 2. A character heatmap is the normalized average character shape of individual letters extracted from the column images and aligned on their centroids (see Figure 10). The heatmaps are neither dependent on nor produced from the primary and secondary analyses (subsections 3.1 and 3.2); they are entirely independent of the pattern recognition and artificial intelligence-based tests. We present these heatmaps to provide an easy-to-use visualization with which palaeographers can observe differences between letters coming from different columns.
We generated three different heatmaps for each letter, corresponding to three aggregate levels: all columns of 1QIsaa, columns 1–27, and columns 28–54 (for some examples, see Figure 11). Though the full-character shapes in Figure 11 seem to differ little from one another at first glance, close inspection reveals subtle differences between the two halves of 1QIsaa. These differences can be observed in the thickness of strokes and the positioning of connections between strokes. See, for example, the subtle difference in the positioning of the left down stroke and the right upper stroke vis-à-vis the diagonal stroke of aleph and the slight difference in the thickness of the diagonal stroke, or the slight difference in the thickness and length of the horizontal stroke of resh (see Figure 12).
In a traditional palaeographic chart, such differences might be deemed insignificant and explicable as normal variation within the handwriting of one writer. If that were the case, i.e., if what we see is normal within-writer variability, then for 1QIsaa one would expect the same distribution of writing style across all columns, which is not the case. Rather, the primary analyses as well as the statistical tests (5a–5d) indicated a significant separation and a clear distribution of the two halves of the manuscript of 1QIsaa on either side of the divide.
Heatmaps should be inspected with a different understanding. Unlike traditional palaeographic charts, heatmaps represent aggregated visualizations of the shape of each letter, hundreds of instances per letter in the case of 1QIsaa. Given the large number of samples and the stability of the centre-position estimate, the differences remaining after averaging indicate an underlying structural difference. Thus, in heatmaps, the subtle differences we see between the aggregate levels are indicative, provided that the separation between the levels has also proven significant by other means, which is the case for 1QIsaa.
Note that we used only the automatically recognized characters to generate the heatmaps from the columns of 1QIsaa. The number of alephs used for the heatmaps is 758, while the total number of alephs in 1QIsaa is 5011. These 758 alephs were automatically extracted by the computer on the basis of known shape structures, and the extracted characters come from all columns, representing a general distribution. This extraction is extremely efficient and has the advantage of requiring no human intervention. Our goal is not to produce an exhaustive enumeration of all alephs in the manuscript but rather to produce heatmaps that cover all columns with a sufficient number of examples. The heatmaps presented here are therefore robust enough to indicate the differences (previous studies also back this claim). To demonstrate the robustness: with the current number of alephs, any mid-intensity pixel of the heatmap (here, orange, with an intensity band of () to () and a total intensity of ()) has a probability of () of giving a different result for that one pixel. So even if we were to increase the number of instances of a particular character, the resulting heatmap would not change significantly (heatmaps of all the individual characters can be requested by emailing the corresponding author).
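The centroid-aligned averaging behind such a heatmap can be sketched as follows (a simplified illustration; the canvas size and the toy glyphs are assumptions, and the actual pipeline works on characters extracted by the binarization network):

```python
import numpy as np

def character_heatmap(glyphs, size=64):
    """Average centroid-aligned binary character images into a heatmap.

    glyphs: list of 2-D binary arrays (1 = ink), one per extracted character.
    Returns a heatmap normalized to [0, 1] on a common canvas.
    """
    canvas = np.zeros((size, size), dtype=float)
    for g in glyphs:
        ys, xs = np.nonzero(g)
        if len(ys) == 0:
            continue
        # Shift so that the ink centroid lands on the canvas centre.
        dy = size // 2 - int(round(ys.mean()))
        dx = size // 2 - int(round(xs.mean()))
        for yy, xx in zip(ys + dy, xs + dx):
            if 0 <= yy < size and 0 <= xx < size:
                canvas[yy, xx] += 1.0
    return canvas / max(len(glyphs), 1)

# Toy usage: two slightly shifted copies of the same vertical 'stroke'
# coincide after centroid alignment.
g1 = np.zeros((10, 10), int); g1[2:8, 4] = 1
g2 = np.zeros((10, 10), int); g2[3:9, 5] = 1
h = character_heatmap([g1, g2])
```

Pixel intensity then reflects how consistently ink occurs at a position across all instances, which is exactly what makes subtle structural differences between aggregate levels visible.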
Step 3. Having found statistically significant differences in the neighbourhood structure for columns in the scroll, and having confirmed that a transition occurs at about the middle of the column series, a more detailed analysis is warranted. Please note that the actual evidence for the differences comes from the primary and secondary analyses; the current focus is illustrative only. The statistical differences obtained are the result of many small textural and allographic differences. For these allographic differences it is also important to keep in mind that an exhaustive list of possible allographs is not required: an allographic codebook approach works very well if it is sufficiently diverse. In the current problem, some of the allographs appear to differ more in their occurrence over the left and right columns, and we can examine them for illustrative purposes, while remembering that this concerns partial evidence from the extremes of the distribution.
For the fraglet feature, a selection was made of the most informative fraglets that are able to discriminate between the leftmost () and the rightmost () columns in the series. Note that the number of fraglets in the SOFM is (). From these fraglets, we automatically generated sets of fraglets to visualize the differences between the two halves of the manuscript. Thus, we ran tests with thousands of combinations of fraglet sets, each providing a new overview. Figure 13 and Figure 14 show such an overview for the relevant columns (more images can be found in supplementary material E). Below each column thumbnail, the left blob indicates the ground truth (‘left series’ is green, ‘right series’ is red), whereas the colour immediately to its right shows the colour that the subset of most informative fraglets predicts. These figures illustrate the statistical view of a separation between the two halves of the manuscript.
4 Discussion and conclusions
The aim of this study was to tackle the palaeographic identification of the unknown scribes of the Dead Sea Scrolls, exemplified by 1QIsaa. The question for 1QIsaa was whether subtle differences in writing should be regarded as normal variation in the handwriting of one scribe or as the similar scripts of two different scribes and, if the latter, whether the writing of the two scribes coincides with the two halves of the manuscript. The evidence is presented in the order in which it was collected.
Firstly, an independent observation was made that, in feature space, the left and right parts of the column series ended up in different regions. Several feature methods confirmed this observation. The preferred explanation is that two main scribes were responsible for copying 1QIsaa, their work indeed separated between columns 27 and 28 by a three-line lacuna at the bottom of column 27. There is a clear separation between the data points in both the Hinge and the Adjoined feature plots (Fig 5 and Fig 7). If we consider an explanation in terms of large variability within one single scribe, the question remains why the points are not randomly scattered (between the two sets of columns) in the PCA space of the adjoined feature plot. Instead, there is a clear indication of separation, at least from one of the angles of the plot space. A more likely scenario is therefore two different scribes working closely together, trying to keep the same style of writing, yet revealing themselves, their individuality, in the textural feature space.
Secondly, a series of tests was performed on a separate shape feature, a Kohonen map of fragmented contours. A series of five questions was asked, starting with a statistical test of whether the pattern of neighbours to the left or right of any given column deviates from the random pattern expected for a single writing style. Because these tests clearly show that the neighbourhood structure is not random, additional analyses were warranted. The distances between columns, as measured in the fraglet-usage space, also showed a highly significant pattern. Finally, the serial column number of the nearest neighbour of each column shows a distinct transition at about the middle of the column series in the scroll. Fitting a logistic model delivered an estimate of the region where this transition occurs, i.e., around column number 27. This point was found without coercion and emerged from two very different quantitative approaches (a least-squares and a separate Monte-Carlo analysis) applied to the time series of the column numbers of the nearest-neighbour matches for each column. In simple terms: columns on the left clearly tend to yield nearest neighbours on the left, and columns on the right clearly tend to yield nearest neighbours on the right. These secondary analyses therefore confirm the suspicion raised on the basis of the exploratory primary analyses by other researchers in the team.
Thirdly, our fully automatic generation of charts with full character shapes for individual Hebrew letters, extracted from the digital images of the ancient manuscript of 1QIsaa, greatly advances how palaeographic charts have previously been produced. The subtle differences visible upon close post-hoc inspection of the heatmaps (both thickness and angular differences exist) also show that heatmaps can help bridge the quantitative analyses and traditional palaeography. Moreover, a post-hoc visual analysis of the most discriminative fraglets in the Kohonen ‘bag of visual words’, which is allowable given the obtained statistical significance of differences between ‘left’ and ‘right’ in other measures, illustrates the transition point and the differential evidence by colour-marked fraglets in the column images. To be sure, the reverse also holds: had there been no statistically significant differences between ‘left’ and ‘right’, it would not have been allowable to look for evidence of difference in post-hoc visual analyses.
Yet there are at least three variables that we need to be transparent about, because they may affect the results in unknown ways: material degradation; writing implements, ink deposition, and writing conditions; and limitations on character extraction.
Regarding material degradation, we have to keep in mind that the scrolls, and by extension the images that constitute the data for our pattern recognition and artificial intelligence techniques, have degraded over the centuries and are no longer in the state in which they were once produced. This degradation introduces a degree of uncertainty into the derived results, even though we did our best to extract the original characters using state-of-the-art methods.
Writing implements and writing conditions can have a significant impact on the outcome of the copied scrolls. The writing implements could differ in the cutting of the pen’s nib, and the writing conditions could change over time. Although there is no evidence that different writing implements were used in 1QIsaa or that a change in writing conditions occurred, the point is that the specific writing implement or a change in writing conditions affects the ink deposition, which in turn affects our modern extraction process of the original characters.
Finally, regarding limitations on extraction, note that character extraction can never be perfect. Nevertheless, we are confident in our methodology, which clearly shows excellent extraction results, both qualitatively and quantitatively. Additionally, our feature extraction methods were tested on an independent dataset, the ‘Firemaker image collection for benchmarking forensic writer identification’. Furthermore, the statistical tests are methodologically robust and independent of the data they are applied to.
The discussion of these variables is not intended to cast doubt on our study’s outcome, which remains robust. However, they remind us that the techniques from pattern recognition and artificial intelligence do not give certainty of identification but statistically grounded probabilities that can help the human expert understand and decide between different possibilities. One key point to highlight: this research is by far the most comprehensive and elaborate study of writer identification on historical manuscripts using state-of-the-art computer-based techniques. The use of feature extraction on both the macro- and microlevel of character shapes is extensive, gauging a writer’s mimetic (cultural) and genetic (bio-mechanical) traits, respectively. The methods used here are rooted in earlier work in forensic writer identification [52, 53, 58, 59]. The minimal use of human interference, the cross-checks, and the re-validation through statistical tests make this study unique and lay the foundation for future advanced studies.
Thus, the conclusion is that the use of robust pattern recognition and artificial intelligence techniques is a breakthrough for the palaeography of writer identification in the Dead Sea Scrolls. We have demonstrated that despite the near uniform handwriting there is a clear separation between two writing styles in 1QIsaa and that the separation between the two styles coincides with the codicological separation between columns 27 and 28 and the two halves of the manuscript, columns 1–27 and columns 28–54.
The similarity in handwriting between different scribes can indicate a common training shared by the scribes, perhaps in a school setting or another close social setting, such as a family context in which a father taught a son to write. For five documentary texts it has been suggested that the similarity in script may result from a common school training . We otherwise have no concrete evidence for such schools, but their presence must be presumed [4, 60, 61].
Our conclusion for 1QIsaa that there were two main scribes also sheds new light on the production of biblical manuscripts in ancient Judea. We have provided new, tangible evidence that such texts were not copied by a single scribe only but that multiple scribes could closely collaborate on one particular manuscript of a text that would come to be regarded and revered as biblical.
The research for this article was carried out under the ERC Starting Grant of the European Research Council (EU Horizon 2020): The Hands that Wrote the Bible: Digital Palaeography and Scribal Culture of the Dead Sea Scrolls (HandsandBible 640497), principal investigator: Mladen Popović. The authors owe a special debt of gratitude to Eibert Tigchelaar, Drew Longacre, and Gemma Hayes who responded to an earlier draft of this article. For the images of 1QIsaa from the Brill collection we are grateful to Brill Publishers. For the high-resolution, multi-spectral images of the Dead Sea Scrolls we are grateful to the Israel Antiquities Authority (IAA), courtesy of the Leon Levy Dead Sea Scrolls Digital Library; photographer: Shai Halevi. We are very grateful to the staff of the IAA Dead Sea Scrolls Unit for their help and support.
Conceptualization: Mladen Popović, Maruf Dhali, Lambert Schomaker
Data curation: Mladen Popović, Maruf Dhali, Lambert Schomaker
Formal analysis: Mladen Popović, Maruf Dhali, Lambert Schomaker
Funding acquisition: Mladen Popović
Investigation: Mladen Popović, Maruf Dhali, Lambert Schomaker
Methodology: Mladen Popović, Maruf Dhali, Lambert Schomaker
Project administration: Mladen Popović
Resources: Mladen Popović, Lambert Schomaker
Software: Maruf Dhali, Lambert Schomaker
Supervision: Mladen Popović
Validation: Mladen Popović, Maruf Dhali, Lambert Schomaker
Visualization: Mladen Popović, Maruf Dhali, Lambert Schomaker
Writing – original draft preparation: Mladen Popović, Maruf Dhali, Lambert Schomaker
Writing – review and editing: Mladen Popović, Maruf Dhali, Lambert Schomaker
Appendix A Supposed scribal idiosyncrasies
The study in  suggests that there are nine scribal idiosyncrasies in both halves of the manuscript. Since almost all of these features concern scribal practices also shared by other scribes in the scrolls, the nine features listed by  are not scribal idiosyncrasies and therefore do not support there being one main scribe in 1QIsaa:
For other examples of writing of parts of words at the end of a line to be repeated in full on the following line for lack of space, see , 107–108;
For other examples of supralinear and infralinear writing at the end of a line, see , 108. Also, 1QIsaa columns 3 and 30 simply did not allow extending much beyond the left column margin because of the stitches connecting two sheets (and in column 45:10 the intercolumn space was already used up);
For a discussion of extending beyond the right column margin, see , 106;
Regarding the ligature of samek and final pe occurring virtually only in ⟨ksp⟩:
1. There are seven examples where this does not occur (cf. 1QIsaa 2:16; 32:17; 33:19; 39:11; 43:17; 45:19; 45:20) and five examples where it does occur (cf. 1QIsaa 7:14; 37:3; 40:15; 45:19; 49:20 [the last one is not listed by ]), sometimes in the same line (45:19);
2. The way in which samek and final pe are written in 1QIsaa 7:14 differs from the other four examples from the second half of the manuscript (especially in the horizontal upper stroke of samek and in the horizontal down stroke of final pe), whereas those four are written in the same way;
3. All other occurrences of words with samek and final pe are non-ligatured except for ⟨y’sp⟩ in 1QIsaa 49:23;
For another example of starting to write the lamed too soon, see 4Q27 15:10;
For many more examples of crossing out words or letters, see , 198–201;
For a discussion of many other examples of cancellation dots, see , 188–198.
Appendix B Image information
| Scan number | Column | Scan number | Column | Scan number | Column |
|---|---|---|---|---|---|
| 2162/SHR 7001 | 1 | 2181/SHR 7022 | 22 | 2209/SHR 7043 | 43 |
| 2163/SHR 7002 | 2 | 2182/SHR 7024 | 24 | 2210/SHR 7044 | 44 |
| 2164/SHR 7003 | 3 | 2183/SHR 7026 | 26 | 2211/SHR 7045 | 45 |
| 2165/SHR 7004 | 4 | 2184/SHR 7027 | 27 | 2212/SHR 7047 | 47 |
| 2166/SHR 7005 | 5 | 2185/SHR 7028 | 28 | 2213/SHR 7048 | 48 |
| 2167/SHR 7006 | 6 | 2186/SHR 7029 | 29 | 2214/SHR 7049 | 49 |
| 2168/SHR 7007 | 7 | 2187/SHR 7030 | 30 | 2215/SHR 7050 | 50 |
| 2169/SHR 7008 | 8 | 2188/SHR 7032 | 32 | 2216/SHR 7051 | 51 |
| 2170/SHR 7009 | 9 | 2189/SHR 7033 | 33 | 2217/SHR 7053 | 53 |
| 2171/SHR 7012 | 12 | 2190/SHR 7034 | 34 | 2449/SHR 7010 | 10 |
| 2172/SHR 7013 | 13 | 2191/SHR 7035 | 35 | 2450/SHR 7011 | 11 |
| 2173/SHR 7014 | 14 | 2192/SHR 7036 | 36 | 2451/SHR 7023 | 23 |
| 2175/SHR 7015 | 15 | 2193/SHR 7037 | 37 | 2452/SHR 7025 | 25 |
| 2176/SHR 7017 | 17 | 2194/SHR 7038 | 38 | 2453/SHR 7031 | 31 |
| 2177/SHR 7018 | 18 | 2195/SHR 7039 | 39 | 2455/SHR 7052 | 52 |
| 2178/SHR 7019 | 19 | 2196/SHR 7040 | 40 | 2456/SHR 7054 | 54 |
| 2179/SHR 7020 | 20 | 2197/SHR 7041 | 41 | - | - |
| 2180/SHR 7021 | 21 | 2208/SHR 7042 | 42 | - | - |
Appendix C Supplementary material on primary analyses
C.1 Preprocessing: binarization and alignment correction
Our first step of preprocessing was binarizing the images of 1QIsaa.
It should be noted here that many modern deep-learning methods can be trained end-to-end on the 1QIsaa scroll without performing binarization, but this is not desirable for the digital palaeography of the scroll. For example, a direct end-to-end solution for clustering the column images of the 1QIsaa scroll can be achieved for writer identification, but there is always a risk of obtaining the solution for the wrong reason: the decision of an artificial neural network may be based on spurious correlations with the texture of the parchment. It is therefore essential to extract only the ink traces (foreground) and no other material features in the images (background). There are several traditional methods for document binarization; the most commonly used are Otsu and Sauvola. These traditional methods work quite well if the contrast between the ink and the background is relatively large, but for the 1QIsaa images this is not the case, mostly due to degradation over time and the skin texture. It is therefore important to digitally extract the ink from the parchment. In this study, we use BiNet, a deep-learning-based method designed in Groningen especially to binarize the scroll images. Instead of using a simple filtering technique, BiNet uses a neural network architecture for the binarization task and therefore yields better output.
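For contrast with BiNet, the classical Otsu baseline can be sketched in a few lines (a pure-NumPy illustration of global thresholding on a toy image, not the method used in this study):

```python
import numpy as np

def otsu_threshold(gray):
    """Otsu's method: pick the threshold maximizing between-class variance.

    gray: 2-D uint8 array. Returns a threshold t; pixels <= t are 'ink'.
    This global method fails when ink/background contrast is low, which
    is why a trained network (BiNet) is preferred for degraded parchment.
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    total = hist.sum()
    sum_all = np.dot(np.arange(256), hist)
    best_t, best_var = 0, -1.0
    w0 = sum0 = 0.0
    for t in range(256):
        w0 += hist[t]                 # weight of the 'dark' class
        if w0 == 0 or w0 == total:
            continue
        sum0 += t * hist[t]
        w1 = total - w0
        mu0, mu1 = sum0 / w0, (sum_all - sum0) / w1
        between = w0 * w1 * (mu0 - mu1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t

# Toy image: dark 'ink' (value 30) on a lighter 'parchment' (value 200).
img = np.full((64, 64), 200, dtype=np.uint8)
img[20:40, 20:40] = 30
t = otsu_threshold(img)
binary = img <= t  # True where ink
```

On a clean bimodal image like this toy example the split is trivial; on the degraded scroll images the histogram modes overlap and such a global threshold breaks down.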
After binarization, the images need to be cleaned further. This cleaning is required to remove the adjoining columns that appear in each of the images of the target columns; it ensures that every image corresponding to a particular column contains the characters from that target column only. Following the cleaning process, rotation and alignment correction needs to be performed as well. If the images are rotated at an angle to the horizontal axis, this affects the feature calculations that are not rotation-invariant. The rotation correction thus ensures that the lines of text are aligned horizontally. After this step, in a few cases, minor affine transformation and stretching corrections are performed in a restrictive manner. These corrections are targeted at aligning the text where the text lines are twisted due to the degradation of the parchment (Fig 2). For the rotation and stretching correction, we used the well-known GIMP tool, a free and open-source raster graphics editor (version 2.8.16). Finally, we split each of the columns vertically in half to further validate the tests.
C.2 Feature extraction: texture level
Textural methods capture statistical information on attributes of handwriting, such as the curvature and slant of the contours of characters. As these methods look at the image as a whole, they do not require a dedicated segmentation technique. The statistical information in the feature vector represents the handwriting style of the document for use in further analysis. As mentioned above, Hinge is a successful textural feature-extraction technique for the scrolls collection, and we use it for 1QIsaa in the current study. Hinge was originally proposed in the work of Marius Bulacu and Lambert Schomaker.
The Hinge feature is a compact transformation of the handwriting that captures both the writing angle and the trace curvature. The bio-mechanics (relative wrist/finger movement control) and allographic choice (the learned and preferred letter shapes) of the writer dictate the slant and roundness of the writing. As Hinge captures these two basic parameters (slant and roundness), it succeeds comfortably in writer identification and verification.
The Hinge kernel calculates the joint probability distribution of the angle combination of two hinged edge fragments. The joint probability of the orientations is quantized into a two-dimensional histogram, where the angles φ1 and φ2 are the angles, with respect to the horizontal plane, of the two arms of a hinged kernel that is convolved over the edges of a handwritten image. For the actual calculations, the hinge can be slid along the contour of each connected ink component of the writing. We use () angles (the number of bins) for both φ1 and φ2, with a leg length of () (Fig 15). We only consider the combinations in which φ1 is smaller than φ2 (to remove redundancy), and we exclude the cases in which φ1 = φ2 (if the angles are equal, the two arms indicate the same point and no useful angle is involved). Finally, this results in a feature vector of () dimensions.
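A simplified sketch of the hinge idea (the leg length, bin count, and toy contour are illustrative assumptions; the actual implementation and parameter values follow the original Hinge design):

```python
import numpy as np

def hinge_histogram(contour, leg=5, n_bins=12):
    """Simplified hinge feature: joint histogram of the angles of two
    'legs' extending from each point along a closed contour.

    contour: (N, 2) array of (x, y) points in tracing order.
    Returns the flattened, normalized joint angle histogram.
    """
    n = len(contour)
    hist = np.zeros((n_bins, n_bins))
    for i in range(n):
        p = contour[i]
        q1 = contour[(i - leg) % n]   # backward leg endpoint
        q2 = contour[(i + leg) % n]   # forward leg endpoint
        phi1 = np.arctan2(q1[1] - p[1], q1[0] - p[0])
        phi2 = np.arctan2(q2[1] - p[1], q2[0] - p[0])
        # Order the pair so that (phi1, phi2) and (phi2, phi1) coincide.
        lo, hi = sorted((phi1, phi2))
        b1 = int((lo + np.pi) / (2 * np.pi) * n_bins) % n_bins
        b2 = int((hi + np.pi) / (2 * np.pi) * n_bins) % n_bins
        hist[b1, b2] += 1
    return (hist / hist.sum()).ravel()

# Toy contour: a circle traced counter-clockwise.
theta = np.linspace(0, 2 * np.pi, 100, endpoint=False)
circle = np.stack([np.cos(theta), np.sin(theta)], axis=1)
feat = hinge_histogram(circle)
```

The resulting normalized histogram serves as a style descriptor: slanted, angular hands and round hands populate different regions of the joint angle space.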
C.3 Feature extraction: allograph level with neural networks
The next type of feature we use is at the character-shape (allograph) level, namely the Fraglet. We will briefly explain how the Fraglets are formed. The connected components (mostly full character shapes) from the binarized images are fragmented to obtain more prototypical shapes from the scrolls collection.
Each fragmented contour (traced counter-clockwise) contains () points, yielding () feature values (the position of each pixel). The contours are normalized to a centre of gravity at (), with the radius emanating from that centre normalized to an average of (). This type of normalization is more stable than bounding-box normalization (bounding boxes are the minimum rectangles containing the ink pixels of individual characters; they create more difficulties in normalization due to the different and often arbitrary shapes of ink blobs). We call these contours the Fraglets.
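The normalization step can be sketched as follows (the target radius of 1.0 and the toy contour are assumptions for illustration; the actual constants are those described above):

```python
import numpy as np

def normalize_fraglet(contour, target_radius=1.0):
    """Normalize a fragmented contour: move the centre of gravity to the
    origin and scale the average radius to a fixed value.

    contour: (N, 2) array of (x, y) boundary points.
    """
    centred = contour - contour.mean(axis=0)
    radii = np.linalg.norm(centred, axis=1)
    mean_r = radii.mean()
    if mean_r == 0:
        return centred
    return centred * (target_radius / mean_r)

# Toy fraglet: an off-centre ellipse-like contour.
t = np.linspace(0, 2 * np.pi, 60, endpoint=False)
frag = np.stack([3 + 2 * np.cos(t), 5 + np.sin(t)], axis=1)
norm = normalize_fraglet(frag)
```

Centre-of-gravity plus mean-radius normalization is insensitive to where the fragment sat on the page and to its size, while a bounding box would jump around with every stray ink pixel.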
To extract these Fraglets, we used binarized images from the scrolls collection with the condition that each image contain at least () black pixels. This ensures the automatic selection of images with a relatively high amount of writing. In total, () full plate images from the scrolls collection fulfilled this criterion. For the full plate images, we use the high-resolution multi-spectral images kindly provided to us by the Israel Antiquities Authority (IAA), which derive from their Leon Levy Dead Sea Scrolls Digital Library project.
Using the extracted Fraglets, we then form a Kohonen map. As mentioned above, this is a self-organizing feature map (SOFM) that uses a neural network with a neighbourhood function to preserve the topological properties of the Fraglets. The resulting SOFM contains () cells, each cell containing () features (Fig 16). The number of cells in the SOFM was derived empirically by measuring the performance on the writer-identification data from our previous study on the scrolls.
Once the Kohonen map (SOFM) is formed, we use it to calculate the Fraglet feature for 1QIsaa. For each of the column images, we calculate a feature vector from the output histogram. We take a spread of counts for a Fraglet over the nearest neighbours in the SOFM. We also calculate the cosine feature (and a corresponding cosine SOFM file). This involves replacing the normalized coordinates with cosine/sine pairs, i.e., each coordinate point along the contour becomes (cos φ, sin φ), with φ representing the angle with the horizontal axis. Finally, this results in a feature vector of () dimensions.
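The histogram computation itself can be sketched as follows (the codebook here is a random stand-in for the trained SOFM and the sizes are illustrative; the spread of counts over neighbouring SOFM cells is omitted for brevity):

```python
import numpy as np

def fraglet_histogram(fraglets, codebook):
    """Winner-take-all fraglet-usage histogram for one column image.

    fraglets: (M, D) feature vectors of the column's contour fragments.
    codebook: (K, D) SOFM cell centroids (a random stand-in here; the
    real codebook is trained on the full scrolls collection, and counts
    are additionally spread over neighbouring SOFM cells).
    """
    hist = np.zeros(len(codebook))
    for f in fraglets:
        # Each fraglet votes for its nearest codebook centroid.
        dists = np.linalg.norm(codebook - f, axis=1)
        hist[np.argmin(dists)] += 1.0
    total = hist.sum()
    return hist / total if total > 0 else hist

rng = np.random.default_rng(0)
codebook = rng.normal(size=(100, 16))   # hypothetical 10x10 SOFM
fraglets = rng.normal(size=(250, 16))   # fragments from one column
h = fraglet_histogram(fraglets, codebook)
```

The normalized histogram is the column’s style descriptor: two columns written in the same hand should use the prototypical shapes in similar proportions.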
To take advantage of both the textural and the allographic feature level, we use a third type of feature, namely the Adjoined feature. Adjoined features are the weighted combination of both Hinge and Fraglet. The adjoining results in a feature vector of 5365 dimensions, preserving the handwriting-style description from both feature levels.
Appendix D Supplementary material on secondary analyses
D.1 Kohonen map of fragmented connected components
In the so-called bag-of-patterns approach used here, a document is assumed to be characterized by the usage (occurrence) frequencies, i.e., the histogram, of the fraglets, similar to the well-known bag-of-words approach in text analysis. The distance between such histograms is computed for pairs of document samples. The histogram is assumed to be a feature vector capturing the occurrence of small, prototypical shapes, such that an overall descriptor for the style of each document sample can be computed. Fraglets do not have to correspond to complete characters; they can be smaller or larger than that, and each is mapped to its best-matching centroid in the SOFM, which is guaranteed not to represent an outlier or singleton pattern due to the very large size of the training data.
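A sketch of comparing two such histograms (the chi-square distance shown here is one common choice for bag-of-patterns descriptors, not necessarily the exact measure used in the paper; the synthetic documents are illustrative):

```python
import numpy as np

def chi2_distance(h1, h2, eps=1e-12):
    """Chi-square distance between two normalized usage histograms;
    small values mean similar fraglet usage, i.e., similar style."""
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))

rng = np.random.default_rng(3)

# Two synthetic 'documents' drawn around the same style profile,
# and one drawn from an unrelated profile.
base = rng.dirichlet(np.ones(50))
doc_a = rng.dirichlet(base * 500 + 1)
doc_b = rng.dirichlet(base * 500 + 1)
doc_c = rng.dirichlet(np.ones(50))

d_same = chi2_distance(doc_a, doc_b)  # same-style pair: small distance
d_diff = chi2_distance(doc_a, doc_c)  # cross-style pair: larger distance
```

The same logic underlies the column-to-column distances used in the secondary analyses: columns copied in one hand lie close together in histogram space.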
D.2 Statistical tests on the fraglet feature distances
- Step 5a
- If the style were uniform, there should be no difference between the number of times a hit (nearest neighbour) is found on the left or on the right of a column number. Using the Chi-square test, the deviation from the expected frequencies can be computed. If it is more likely that a point on the left has a hit on the left in the sequence, or, vice versa, that a point on the right has a hit on the right, then the distribution is not homogeneous. The window under consideration in the column series is varied from 9 to 26: big enough to catch hits, but smaller than the midpoint of the sequence. The returned probability that the pattern of counts is non-accidental will be averaged and plotted over the column numbers. A minimum or dip in the curve will be indicative of a column number where the voting pattern for left vs right column hits is not random. The common significance threshold will be used to decide on such singular points. No information concerning a critical column number is used. Due to the dependent nature of the running time window of left and right votes for neighbours, additional testing is needed.
- Step 5b
- If the style were uniform over columns, the distances to the nearest neighbours on the left and right should be comparable (of similar value) over the column series. On the other hand, if there are style differences, the average value of the distance may change over the column series. For this, a one-way analysis of variance can be used, or a t-test, with the categories left and right, for the leftmost and rightmost columns in the series, respectively. Also here, a windowed approach is used, where distances are computed over windows of 18 to 26 columns and averaged. Please note that, similar to the approach in Step 5a, no information is used concerning a column where a style transition may be supposed to occur.
- Step 5c
- If the style were uniform, we would expect the same average position for hits over the column series; indeed, the average position should be in the middle of the series. On the other hand, if there are style differences, the estimated average position per column would vary. In the case of a linear style development in the series, the estimated average position of hits would also vary linearly. If, alternatively, a sudden change in style occurs, we would expect something like a ‘step response’, i.e., a discontinuity in the series.
- Step 5d
- Following 5c, if there is a phase transition in the sequence of columns, fitting a logistic curve on the variable ‘average neighbour position’ over columns should reveal the switching point reliably, i.e., with a high Pearson correlation of the fit. The number of the critical phase-transition column is the output of this test.
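Step 5a can be sketched with SciPy's chi-square goodness-of-fit test on the windowed left/right hit counts; the 50/50 homogeneity expectation and the function name are the only assumptions here:

```python
from scipy.stats import chisquare

def left_right_test(hits_left, hits_right):
    """Test whether nearest-neighbour hits fall left/right of a column
    more often than the homogeneous (50/50) expectation."""
    observed = [hits_left, hits_right]
    total = hits_left + hits_right
    expected = [total / 2, total / 2]
    stat, p = chisquare(observed, f_exp=expected)
    return stat, p
```

A balanced count returns p = 1 (homogeneous); a strongly one-sided count returns a very small p, which is what produces the dip in the plotted curve at the transition column.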
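Step 5b, as a sketch: with exactly two groups, a one-way ANOVA and an equal-variance t-test are equivalent (F = t²), which the example verifies; group labels and data here are illustrative:

```python
from scipy.stats import f_oneway, ttest_ind

def compare_halves(dist_left, dist_right):
    """One-way ANOVA and Student t-test on the nearest-neighbour
    distances for the left vs right column groups."""
    F, p_anova = f_oneway(dist_left, dist_right)
    t, p_t = ttest_ind(dist_left, dist_right)  # equal variances assumed
    return (F, p_anova), (t, p_t)
```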
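Step 5d, a sketch of the four-parameter logistic fit; the parameterization mirrors the reported xoff/yoff/amplitude/steepness values, but the starting values and solver settings are assumptions (the study also used a Monte Carlo estimator alongside least squares):

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(x, xoff, yoff, amplitude, steepness):
    """Four-parameter logistic modelling the phase transition in
    'average neighbour position' over the column series."""
    return yoff + amplitude / (1.0 + np.exp(-steepness * (x - xoff)))

def fit_transition(columns, avg_positions):
    """Least-squares fit; xoff estimates the switching column."""
    p0 = [np.median(columns), avg_positions.min(),
          np.ptp(avg_positions), 1.0]
    params, _ = curve_fit(logistic, columns, avg_positions,
                          p0=p0, maxfev=10000)
    return params  # xoff, yoff, amplitude, steepness
```

The fitted xoff is the estimated phase-transition column; the Pearson correlation between the fitted curve and the data quantifies how well a step-like model describes the series.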
________________________________________________________
One-way ANOVA
Name     N    Mean    SD     Min    Max
Left     18   0.238   0.003  0.233  0.243
Right    17   0.231   0.008  0.220  0.249
Total    35   0.234   0.007  0.220  0.249
________________________________________________________
Weighted Means Analysis:
Source   SS     df  MS     F       p
Between  0.000  1   0.000  11.189  0.002 **
________________________________________________________
t-test
Name     N    Mean    SD     Min    Max
Group-1  18   0.238   0.003  0.233  0.243
Group-2  17   0.231   0.008  0.220  0.249
Total    35   0.234   0.007  0.220  0.249
________________________________________________________
Weighted Means Analysis:
t(33) = 3.345   p = 0.002
The distance of a column to its nearest neighbour is significantly different between matches found to the left and to the right, with a slightly larger distance for the first half as compared to the second half. One-way ANOVA and a t-test both return p = 0.002.
The average distances obtained if the queried column is on the left of the found column in the sequence (the best hit is in the ‘future’). Between columns 25 and 30, this distance drops, i.e., ‘future’ columns fit better. After column 35, the distance increases again. The mirror version, i.e., ‘the best hit for a query is in the past’, also shows a transition between columns 25 and 30. The pattern is a bit less clear but confirms the presence of a transition. Average position of the best-fitting neighbours of a column, in the column series: left of column 27, the average position of the hits is between 20 and 25; right of column 27, it is between 30 and 35. The light blue line represents column 27. In the case of a linear style development, the diagonal blue line would have been approximated. In the case of no style development, the y-values would have been roughly constant.
One-way ANOVA for the variable: position of the nearest neighbour (column number) of a given DSS column. The nearest neighbour is computed in fraglet-histogram space, for two groups: left (columns up to 27) and right (columns beyond 27).
________________________________________________________
Name    N    Mean     SD     Min     Max
LEFT    27   23.847   4.008  16.312  35.688
RIGHT   27   32.023   3.689  25.625  38.688
Total   54   27.935   5.620  16.312  38.688
________________________________________________________
Weighted Means Analysis:
Source    SS       df  MS       F       p
Between   902.418  1   902.418  60.822  0.000 ***
Within    771.519  52  14.837
________________________________________________________
Again: p < 0.001
Also this analysis indicates that the between-column similarity is highest ‘ipsilaterally’ with respect to the cut point (column 27): left looks like left, right looks like right. The significance is so high that, with three decimals, p appears as zero in the output of the statistics tool. We can safely say that p < 0.001. This is a rigorous alpha, similar to medical sample comparisons, i.e., a three-star result (***). In other words: the probability that this difference is the consequence of random fluctuation is less than 0.001.
The midpoint for category ’left’ is at column 24.
The midpoint for category ’right’ is at column 32.
Finding the phase transition using a logistic fit
Assumption: in the column series there is a phase transition somewhere. Indications for this came from Chi-square tests on the distributions of hits left and right of a target column. These tests indicate a switch at around column 27. When using this as the split criterion, a subsequent one-way ANOVA revealed a significant difference (p < 0.001) between columns on the left and right, which appear to have their nearest-neighbour hits on the left and right, respectively, consistent with the expectation of large within-group similarity. If we model the series as a phase transition, using a logistic function, will that transition occur at or about column 27?
Without seeding a logistic function estimator with knowledge concerning the magical number 27, this was the output:
__________________________________________________________________
mc-logist-reg-predict:
xoff        yoff        amplitude   steepness
27.824583   24.094666   7.983507    0.924704
__________________________________________________________________
The value of xoff means that the transition is estimated to occur between columns 27 and 28, with a smooth transition steepness: in the separate .svg plot, the transition lasts from column 24 to 32.
Sigmoid regression analysis output:
______________________________________________________
Analysis for 54 cases of 2 variables:
Variable   sigmoid     avgpos
Min        24.0947     16.3125
Max        32.0782     38.6875
Sum        1514.0753   1508.5000
Mean       28.0384     27.9352
SD         3.8642      5.6199
______________________________________________________
Correlation Matrix:
sigmoid   1.0000
avgpos    0.7432   1.0000
Variable  sigmoid  avgpos
______________________________________________________
Regression Equation for sigmoid:
sigmoid = 0.511 avgpos + 13.7632
______________________________________________________
Significance test for prediction of sigmoid
Mult-R   R-Squared   SEest    F(1,52)   prob (F)
0.7432   0.5523      2.6102   64.1608   0.0000
______________________________________________________
The fit of the sigmoid transition model is significant, with a correlation r of 0.74 (p < 0.001). An exact fit would have yielded r = 1.0. Although not perfect, r = 0.74 would be considered a very robust correlation in psychology and biology.
The model explains 55% of the variance in the data, which is not strange, given that the model is a stylized description of a time sequence with irregularities. If we smooth the irregularities over time, using a running average over 3 or 5 samples (columns) to damp the within-writer variation, the correlation with the sigmoid increases considerably:
If we smooth the column time series over 3 values, r = 0.87 (76% of the variance explained by the sigmoid phase transition).
If we smooth the column time series over 5 values, r = 0.93 (86% of the variance explained by the sigmoid phase transition).
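The running-average smoothing described above can be sketched as a simple centered moving average; the window alignment (discarding the edge columns) is an assumption of this sketch:

```python
import numpy as np

def smooth(series, window):
    """Centered running average over `window` columns, to damp the
    within-writer variation before correlating with the sigmoid."""
    kernel = np.ones(window) / window
    return np.convolve(series, kernel, mode='valid')
```

After smoothing, the Pearson correlation between the logistic fit and the (aligned) smoothed series can be recomputed, e.g. with scipy.stats.pearsonr, which is what raises r from 0.74 to 0.87 (window 3) and 0.93 (window 5) in the reported results.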
D.3 Least-squares fitting (SciPy) of a logistic curve on the estimated average serial position of the nearest neighbour of a column
The result of the Monte Carlo-based logistic model fit was replicated with a more traditional least-squares curve fit (Python SciPy), yielding a phase transition at column 26.6 for the raw data (r = 0.74), at column 26.2 for a time series smoothed with a window of three points (r = 0.87), and at column 26 for a window of five points (r = 0.94). Curve-fitting results are shown in Figs. 19, 20 and 21. Even without smoothing, the phase transition is clearly visible; with smoothing, the pattern is even clearer (window size 5, Fig. 21).
_________________________________________________________________
Raw Monte Carlo (mc:) and least squares (scipy:) results
_________________________________________________________________
Xoffset     Yoffset     A          steepness
27.824583   24.094666   7.983507   0.924704  < mc:try     r=0.7432
28.243372   24.213305   7.994352   2.602412  < mc:RAW     r=0.7363
27.727867   23.932599   8.002640   0.876055  < mc:RAW     r=0.7432
27.634910   24.354421   7.740445   1.295146  < mc:RAW     r=0.7414
27.882597   24.194399   7.872786   7.975232  < mc:RAW     r=0.7365
26.605149   23.233309   9.102090   0.435106  < scipy:RAW  r=0.7441
_________________________________________________________________
27.9197 Avg,RAW
27.170642   23.515902   7.901666   0.751979  < mc:SMO3    r=0.8692
25.067162   22.548375   9.256635   0.362236  < mc:SMO3    r=0.8713
24.622219   22.198244   9.320780   0.468216  < mc:SMO3    r=0.8665
28.309579   23.505822   8.146873   1.383725  < mc:SMO3    r=0.8674
26.210565   23.064034   9.189271   0.407289  < scipy:SMO3 r=0.8728
_________________________________________________________________
26.2924 Avg,SMO3
25.907792   23.278395   8.809249   0.572107  < mc:SMO5    r=0.9316
23.054408   22.920690   8.547156   0.335815  < mc:SMO5    r=0.9243
24.057952   22.826461   8.236999   0.341474  < mc:SMO5    r=0.9309
24.405605   22.058579   9.809585   0.280687  < mc:SMO5    r=0.9319
25.990002   22.951100   9.342036   0.389834  < scipy:SMO5 r=0.9360
_________________________________________________________________
24.3564 Avg,SMO5
The least-squares approach gives a single result; the Monte Carlo estimation was run several times (1 hour of computing per fit). Although smoothing over 5 columns (SMO5) gives the highest Pearson correlation, the smoothing also biases the estimate of the transition point. Therefore, the estimate of the transition point for the ‘RAW’ data is to be preferred. One estimation yields a negative number for the Xoffset; this is corrected here, to be consistent with Eq. 1.
Appendix E Tertiary analyses
The charts with full character shapes for individual Hebrew letters improve significantly on the traditional palaeographic chart. Each instance of a character can be directly traced back to its exact position in the manuscript of 1QIsaa. Also, there is no modern human hand involved, either in retracing the characters or in character reconstruction. The ink traces are extracted as is from the digital images and retain the movements once made by the ancient scribe’s hand. Figure 22 presents two such charts. It is possible to request charts from all the individual characters by emailing the corresponding author.
- Popović M. The manuscript collections: An overview. In: Brooke GJ, Hempel C, editors. T&T Clark Companion to the Dead Sea Scrolls. London: T&T Clark; 2019. p. 37–50.
- Zahn MM. Beyond ‘Qumran Scribal Practice’: The case of the Temple Scroll. Revue de Qumran. 2017;29/110:185–203.
- Cotton HM, Yardeni A. Discoveries in the Judean Desert. Volume 27. Aramaic, Hebrew and Greek documentary texts from Nahal Hever and other sites: with an appendix containing alleged Qumran texts (The Seiyal Collection I); 1997.
- Wise MO. Language and Literacy in Roman Judaea: A Study of the Bar Kokhba Documents. Yale University Press; 2015.
- Tov E. Scribal practices and approaches reflected in the texts found in the Judean Desert. Brill; 2004.
- Tigchelaar E. In search of the scribe of 1QS. In: Paul SM, Kraft RA, Schiffman LH, Fields WW, editors. Emanuel: Studies in Hebrew Bible, Septuagint, and Dead Sea Scrolls in honor of Emanuel Tov. Brill; 2003. p. 339–352.
- Yardeni A. A note on a Qumran scribe. New Seals and Inscriptions, Hebrew, Idumean, and Cuneiform, ed Meir Lubetski. 2007; p. 281–292.
- Ulrich E. Identification of a Scribe Active at Qumran: 1QPs-4QIsa-11QM. Meghillot: Studies in the Dead Sea Scrolls. 2007;6:201–210.
- Allegro JM. The Dead Sea Scrolls. 1957;.
- Wise MO. Accidents and Accidence: A Scribal View of Linguistic Dating of the Aramaic Scrolls from Qumran. In: Wise MO. Thunder in Gemini and other essays on the history, language and literature of Second Temple Palestine. Sheffield: JSOT Press; 1994. p. 103–151.
- Alexander PS. Literacy among Jews in Second Temple Palestine: Reflections on the Evidence from Qumran. Hamlet on a Hill: Semitic and Greek Studies Presented to Professor T Muraoka on the Occasion of his Sixty-Fifth Birthday. 2003; p. 3–24.
- Golb N. Khirbet Qumran and the Manuscripts of the Judaean Wilderness: Observations on the Logic of their Investigation. Journal of Near Eastern Studies. 1990;49(2):103–114.
- Golb N. Who wrote the Dead Sea scrolls?: The search for the secret of Qumran. Scribner; 1995.
- Crawford SW. Scribes and scrolls at Qumran. William B. Eerdmans Publishing Company; 2019.
- Sirat C. Writing as handwork: a history of handwriting in Mediterranean and Western culture. 2007;.
- Papaodysseus C, Rousopoulos P, Giannopoulos F, Zannos S, Arabadjis D, Panagopoulos M, et al. Identifying the writer of ancient inscriptions and Byzantine codices. A novel approach. Computer Vision and Image Understanding. 2014;121:57–73.
- Davis T. The practice of handwriting identification. Library. 2007;8(3):251–276.
- Harralson HH, Miller LS. Huber and Headrick’s Handwriting Identification: Facts and Fundamentals. Crc Press; 2017.
- Derolez A. The palaeography of Gothic manuscript books: From the twelfth to the early sixteenth century. vol. 9. Cambridge University Press; 2003.
- Stokes PA. Digital approaches to paleography and book history: some challenges, present and future. Frontiers in Digital Humanities. 2015;2:5.
- Kahle P. Theologische Literaturzeitung. 1950;75:539.
- Kahle P. Die hebräischen Handschriften aus der Höhle. 1951;.
- Noth M. Eine Bemerkung zur Jesajarolle vom Toten Meer. Vetus Testamentum. 1951;1(1):224–226.
- Kuhl C. Schreibereigentümlichkeiten: Bemerkungen zur Jesajarolle (DSIa). Vetus Testamentum. 1952;2(1):307–333.
- Brooke GJ. The Bisection of Isaiah in The Scrolls from Qumran. In: Alexander PS, Brooke GJ, Christmann A, Healey JF, Sadgrove PC, editors. Studia Semitica: The Journal of Semitic Studies jubilee volume. 2005. p. 73–94.
- Trever JC. A Paleographic Study of the Jerusalem Scrolls. Bulletin of the American Schools of Oriental Research. 1949;113(1):6–23.
- Ulrich E, Flint PW. Discoveries in the Judaean Desert XXXII: Qumran Cave 1: II. The Isaiah Scrolls: Part 2: Introductions, Commentary, and Textual Variants. OUP Oxford; 2011.
- Justnes Å. The Hand of the Corrector in 1QIsa a XXXIII 7 (Isa 40, 7-8): Some Observations. Semitica. 2015;57:205–210.
- Brownlee WH. The manuscripts of Isaiah from which DSIa was copied. Bulletin of the American Schools of Oriental Research. 1952;127(1):16–21.
- Martin M. The scribal character of the Dead Sea Scrolls. 1958.
- Kutscher EY. The Language and Linguistic Background of the Isaiah Scroll: 1QIsa. vol. 6. Brill; 1974.
- Giese RL. Further Evidence for the Bisection of 1QIsa. Textus. 1988;14(1):61–70.
- Cook J. Orthographical peculiarities in the Dead Sea biblical scrolls. Revue de Qumran. 1989;14(2 (54):293–305.
- Cook J. The Dichotomy of 1QIsa. Intertestamental essays in honour of Jósef Tadeusz Milik. 1992; p. 7–24.
- Pulikottil PU, Pulikottil P. Transmission of biblical texts in Qumran: the case of the large Isaiah scroll 1QIsa. Sheffield Academic Press; 2001.
- Williamson H. Scribe and Scroll: Revisiting the Great Isaiah Scroll from Qumran. In: Clines DJA, Richards KH, Wright JL, editors. Making a Difference: Essays on the Bible and Judaism in Honor of Tamara Cohn Eskenazi. 2012. p. 329–342.
- Longacre D. Developmental Stage, Scribal Lapse, or Physical Defect? 1QIsa’s Damaged Exemplar for Isaiah Chapters 34–66. Dead Sea Discoveries. 2013;20(1):17–50.
- Dhali MA, He S, Popović M, Tigchelaar E, Schomaker L. A digital palaeographic approach towards writer identification in the dead sea scrolls. In: Proceedings of the 6th International Conference on Pattern Recognition Applications and Methods-Volume 1: ICPRAM. vol. 2017. Scitepress; Setúbal; 2017. p. 693–702.
- He S, Schomaker L. Beyond OCR: Multi-faceted understanding of handwritten document characteristics. Pattern Recognition. 2017;63:321–333.
- Bulacu M, Schomaker L. Text-independent writer identification and verification using textural and allographic features. IEEE transactions on pattern analysis and machine intelligence. 2007;29(4):701–717.
- Niels R, Vuurpijl L, Schomaker L. Automatic allograph matching in forensic writer identification. International Journal of Pattern Recognition and Artificial Intelligence. 2007;21(01):61–81.
- Dhali MA, Jansen CN, de Wit JW, Schomaker L. Feature-extraction methods for historical manuscript dating based on writing style development. Pattern Recognition Letters. 2020;131:413–420.
- Ciula A. Digital palaeography: using the digital representation of medieval script to support palaeographic analysis. Digital Medievalist. 2005;1.
- Faigenbaum-Golovin S, Shaus A, Sober B, Levin D, Na’aman N, Sass B, et al. Algorithmic handwriting analysis of Judah’s military correspondence sheds light on composition of biblical texts. Proceedings of the National Academy of Sciences. 2016;113(17):4664–4669.
- Faigenbaum-Golovin S, Shaus A, Sober B, Turkel E, Piasetzky E, Finkelstein I. Algorithmic handwriting analysis of the Samaria inscriptions illuminates bureaucratic apparatus in biblical Israel. PLOS ONE. 2020;15(1):e0227452.
- Shaus A, Gerber Y, Faigenbaum-Golovin S, Sober B, Piasetzky E, Finkelstein I. Forensic document examination and algorithmic handwriting analysis of Judahite biblical period inscriptions reveal significant literacy level. PLOS ONE. 2020;15(9):e0237962.
- van der Kooij G. Classifying early NW-Semitic scripts: A search for writing traditions by studying script as artefact. 12th Mainz International Colloquium on Ancient Hebrew (to be published); 29 October–1 November 2015.
- Dhali MA, de Wit JW, Schomaker L. BiNet: Degraded-Manuscript Binarization in Diverse Document Textures and Layouts using Deep Encoder-Decoder Networks. arXiv preprint arXiv:191107930. 2019;.
- Lim TH, Alexander PS. Volume 1. In: The Dead Sea Scrolls Electronic Library. Brill; 1995.
- The Leon Levy Dead Sea Scrolls Digital Library;. https://www.deadseascrolls.org.il/.
- Schomaker L, Bulacu M. Automatic writer identification using connected-component contours and edge-based features of uppercase western script. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2004;26(6):787–798.
- Schomaker L, Franke K, Bulacu M. Using codebooks of fragmented connected-component contours in forensic and historic writer identification. Pattern Recognition Letters. 2007;28(6):719–727.
- Schomaker LRB, Bulacu M. GIWIS v3.1: Groningen Intelligent Writer Identification System, Documentation V3.1; 2011. doi:10.5281/zenodo.4051006.
- Bod R, Hay J, Jannedy S. Probabilistic linguistics. Mit Press; 2003.
- Kroch AS. Reflexes of grammar in patterns of language change. Language variation and change. 1989;1(3):199–244.
- Schomaker L, Bulacu M, Franke K. Automatic writer identification using fragmented connected-component contours. In: Ninth International Workshop on Frontiers in Handwriting Recognition. IEEE; 2004. p. 185–190.
- Bulacu M, Schomaker LRB, Vuurpijl L. Writer Identification Using Edge-Based Directional Features. In: ICDAR ’03: Proceedings of the 7th International Conference on Document Analysis and Recognition. Washington, DC, USA: IEEE Computer Society; 2003. p. 937–941.
- Franke K, Schomaker L, Veenhuis C, Taubenheim C, Guyon I, Vuurpijl L, et al. WANDA: A generic Framework applied in Forensic Handwriting Analysis and Writer Identification. HIS. 2003;105:927–938.
- Schomaker L. Writer identification and verification. In: Advances in Biometrics. Springer; 2008. p. 247–264.
- Macdonald M. “Review of Wise 2015”. Journal of Roman Archaeology. 2017;30:832–842.
- Healey JF. “Literacy in Literate Societies’: The Scribe in Nabataean and other Aramaic Contexts”. Languages, Scripts, and Their Uses in Ancient North Arabia, ed Michael C A Mcdonald. 2018; p. 31–38.
- Otsu N. A threshold selection method from gray-level histograms. IEEE transactions on systems, man, and cybernetics. 1979;9(1):62–66.
- Sauvola J, Pietikäinen M. Adaptive document image binarization. Pattern recognition. 2000;33(2):225–236.
- Team TGD. Gnu image manipulation program - GIMP - version 2.8.6; 2016.
- Dillon M. Introduction to modern information retrieval: G. Salton and M. McGill.; 1983.