DEVELOPMENT WRITTEN TEST ASSESSMENT BASED ON INTEGRATED THEMATIC LEARNING FOR V GRADE STUDENTS OF ELEMENTARY SCHOOL

This study developed a written test assessment based on integrated thematic learning for elementary school and examined its difficulty level, discriminating power, distractor effectiveness, validity, and reliability. The research population was grade V students of elementary schools in Pringsewu District, Lampung Province, Indonesia. The study used a research and development method adapted from the steps of Borg & Gall. Data were collected with questionnaires and tests. The data were analyzed through expert validation, and the test instrument was analyzed for validity, reliability, difficulty, discriminating power, and distractor effectiveness. According to the validation by assessment, material, and linguistic experts, the question instrument was very well developed and worthy of use. The results of large-scale test of product development and the results of the instrument

knowledge, and skills. Knowledge assessment, as intended here, is an activity carried out to measure students' mastery of knowledge.
The main objective of the assessment process in education is to interpret differences in student learning patterns, in accordance with Article 4 point 1 of Permendikbud number 23 of 2016, namely, "Assessment of learning outcomes by educators aims to monitor and evaluate the process, progress of learning, and improvement of student outcomes continuously". Assessment can help teachers focus on efficient and ongoing teaching strategies. This is in line with Permendikbud number 23 of 2016 concerning Educational Assessment Standards, Article 1 point 1, which stipulates that, "Educational assessment standards are criteria regarding the scope, objectives, benefits, principles, mechanisms, procedures and assessment instruments for student learning outcomes that are used as a basis in assessment of learning outcomes of students in basic education and secondary education". Point 2 states that, "Assessment is the process of gathering and processing information to measure the achievement of student learning outcomes". Point 4 then states, "A test (ulangan) is a process carried out to measure the achievement of students' competencies on an ongoing basis in the learning process to monitor the progress and improvement of student learning outcomes".
The test is an instrument or tool in measurement. The test as a measurement instrument serves to reveal data and information about the object of measurement.
According to Saifuddin (2016, p. 1), "Validation of research results is determined by the validity of the data, while valid data can only be obtained using a good test". The definition of a test was stated by Anastasi (in Saifuddin, 2016, p. 1), namely, "A psychological test is essentially the objective and standardized measure of a sample of behavior". Tests as measurement instruments must fulfill several characteristics. Saifuddin said that the important characteristic of a good measuring tool is that it is able to produce accurate data and information that is valid and reliable. The test also needs to be objective, standardized, practical, and economical.
Tests are also interpreted as a number of questions that must be answered or statements that must be responded to, with the aim of measuring a person's ability level (Purnomo, 2016, p. 39). The cognitive ability test is one form of instrument that is widely used in various assessment activities. The test score is used as part of the basis for decisions on grade promotion at school.
Integrated thematic learning as a concept is a learning approach that involves several subjects to provide meaningful learning experiences for children. "An integrated approach allows learners to explore, gather, process, refine, and present information about topics they want to investigate without the restrictions imposed by traditional subject barriers" (Pigdon, 1992). Integrated learning is believed to be a practice-oriented learning approach that suits the needs of children. Effective integrated learning helps create broad opportunities for students to see and build interrelated concepts.
This learning provides opportunities for students to understand complex problems in their surrounding environment with a comprehensive view. Students in integrated learning are expected to be able to identify, collect, assess, and use the information around them meaningfully. This can be achieved not only by giving students new knowledge but also by giving them the opportunity to consolidate and apply it in a variety of new and increasingly diverse situations. The scope of integration in the 2013 Curriculum includes integration within subjects, integration between subjects, and integration beyond subjects. Strengthening occurs in both the learning process and the assessment process.
A questionnaire on the midterm test questions, answered by teachers at schools implementing the 2013 Curriculum in Pringsewu Regency, showed that: the midterm test questions did not test all subjects contained in each theme; not all basic competencies (KDs) and indicators in each theme were tested; the test questions did not show integration; the language used was not easy to understand; no answer sheets were provided for students; and the teachers did not analyze the questions to find out the quality of the items. Based on this, a study was conducted on the development of cognitive assessment for the 2013 Curriculum in primary schools, focusing on developing written test assessments for formative tests based on themes.
The research problem is how to develop integrated thematic-based written test assessments for grade V of elementary school. Thus, the purpose of this study is to develop an integrated thematic written test assessment for grade V of elementary school with adequate discriminating power, difficulty level, distractor effectiveness, validity, and reliability.

B. Method
This research used the Research and Development (R & D) method, adapted from the procedures and steps proposed by Borg (1983): 1) Research and Information Collecting, 2) Planning, 3) Develop Preliminary Form of Product, 4) Preliminary Field Testing (initial validation testing), 5) Main Product Revision, 6) Main Field Testing (small-scale field testing), 7) Operational Product Revision, 8) Operational Field Testing (large-scale field testing), 9) Final Product Revision, and 10) Dissemination and Implementation.
The sources of research data were experts, teachers, and students. Data were collected with a) questionnaires and b) tests. Qualitative analysis of the expert, teacher, and student responses was used to determine the feasibility of the test developed. Quantitative analysis was used to determine the level of difficulty, discriminating power, distractor effectiveness, validity, and reliability. The sample consisted of 10 fifth grade students for the small-scale trial (random sampling), 36 fifth grade students from two elementary schools for the large-scale trial, and 24 fifth grade students from another elementary school for the implementation test.

Research result
The analysis of the questionnaire responses of teachers implementing the 2013 Curriculum, concerning the midterm examination instrument, showed that: the midterm test items did not test all subjects contained in each theme; not all basic competencies and indicators in each theme were tested; the test questions did not show integration; the language used was not clear and easy to understand; the teachers did not make an assessment rubric for each subject; and the teachers did not analyze the questions to find out the quality of the items.
Based on these findings, various literature and journals were studied as the basis for developing integrated thematic-based written test assessments. The research and development design aimed to produce a product and to test its effectiveness. The initial design of the assessment, before validation, contained a summary of the material on Islamic Kingdoms in Indonesia, question boxes, and question instruments. Each question in the integrated thematic-based assessment has a text relating to the question asked and supporting images that help students understand the text, so that they can answer the question well. The instrument consists of 40 multiple-choice questions. The material assessed is the theme History of Indonesian Civilization, subtheme Islamic Kingdoms in Indonesia. The quantitative results of the instrument test were processed with ANATES.

Research Result and Discussion
Design validation from assessment experts, material experts, and linguists.
The expert validators reviewed the components of the assessment developed and rated it against the indicators that had been determined. The experts declared the assessment valid and feasible to use, with revisions. The assessment experts gave a score of 80.86% (very good criteria), the material experts 80% (good criteria), and the linguists 75% (good criteria), so it was concluded that the assessment developed was valid and feasible for use in small-scale assessment trials.
Input and advice from the experts on the weaknesses and shortcomings of the assessment were addressed by revising the validated assessment instrument. The assessment and material experts advised making the answer options similar in length. The linguists suggested corrections to spelling, word choice, and sentences in the question instrument.
The small-scale field trial in this study was an assessment test involving a sample of 10 fifth grade students and the grade V elementary school teachers.
The small-scale assessment trial was intended to measure the legibility of the questions in the assessment developed, so that suggestions and criticisms from this stage could guide the revision of the assessment. The trial was carried out by giving the written test instrument and questionnaires to students and teachers.
The analysis of the student questionnaire in the small-scale trial gave a value of 74.5% (good criteria), and the teacher questionnaire gave a value of 77.87% (good criteria). It was therefore concluded that the assessment developed was feasible for use in the large-scale assessment trial.
The revision after the small-scale trial followed the input from students and teachers in the legibility questionnaire. In the small-scale trial, students and teachers suggested: (1) improving the reference images in the questions; (2) making the pictures clearer; and (3) making the answer options even in length.
The large-scale assessment test aimed to collect data on the quality of the items, including validity, reliability, discriminating power, level of difficulty, and distractors. The results of the item analysis are as follows. The validity analysis shows that 35 questions were declared valid and 5 questions were declared invalid. The percentage results in the table show that 87.5% of the questions are valid and 12.5% of the questions are invalid.
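As a rough illustration of how such a validity tally can be produced, the sketch below correlates each item's 0/1 scores with the students' total scores and compares the result with a critical r value. The item-total correlation criterion, the reuse of the r-table value 0.329 (quoted in the text for n = 36 at the 5% level) as the cut-off, the function names, and the toy response matrix are all assumptions made for illustration; the source does not spell out the exact procedure applied by ANATES.

```python
from math import sqrt

# Critical r quoted in the text for n = 36 at the 5% level; using it as the
# validity cut-off is an assumption of this sketch.
R_TABLE = 0.329


def pearson(x, y):
    """Pearson product-moment correlation; returns 0.0 if a series is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def item_total_correlations(responses):
    """responses: one row per student, one 0/1 score per item."""
    totals = [sum(row) for row in responses]
    n_items = len(responses[0])
    return [pearson([row[j] for row in responses], totals) for j in range(n_items)]


def percent(part, whole):
    """Share of `part` in `whole`, as a percentage."""
    return 100.0 * part / whole


# Hypothetical toy data (4 students, 3 items), not taken from the study.
toy = [[1, 1, 0], [1, 0, 0], [0, 0, 1], [1, 1, 1]]
valid_flags = [r > R_TABLE for r in item_total_correlations(toy)]
print(valid_flags)

# Figures reported in the text: 35 valid and 5 invalid items out of 40.
print(percent(35, 40), percent(5, 40))  # 87.5 12.5
```

With the reported counts, the last line reproduces the 87.5% and 12.5% shares stated above.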
The percentages of the item validity results are presented in the following diagram:

Diagram 1. Results of percentage analysis item

Based on the results of the validity analysis, it was concluded that the 35 valid questions could be used for the usage test.
Based on the reliability calculation, the r11 value was 0.71. This value was compared with the product-moment r-table value at the 5% significance level with n = 36, which is 0.329. Because r11 was greater than the r-table value, the questions in the integrated thematic-based assessment instrument were declared reliable. The results of the calculation of reliability are set out in the following
3, 5, 7, 13, 14, 15, 18, 19, 20, 22, 25, 26, 30. Good criteria as many as 25 questions, namely the number: 4, 6, 8, 9, 10, 11, 12, 16, 17, 21, 23, 24, 27, 28, 29, 31, 32, 33, 34, 4, 6, 8, 9, 10, 11, 12, 13, 14, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 31, 32, 33, 34, 35, 37, …
Based on the results of the distractor analysis, the distractors of the analyzed questions were rated very good, good, not good, bad, and very bad. Answer option A has 10 very good, 5 good, 11 not good, 12 bad, and 2 very bad. Answer option B has 18 very good, 10 good, 7 not good, 2 bad, and 3 very bad. Answer option C has 14 very good, 10 good, 10 not good, 4 bad, and 2 very bad. Answer option D has 10 very good, 14 good, 12 not good, 2 bad, and 2 very bad.
As percentages, option A is 25% very good, 12.5% good, 27.5% not good, 30% bad, and 5% very bad. Option B is 45% very good, 25% good, 17.5% not good, 5% bad, and 7.5% very bad. Option C is 35% very good, 25% good, 25% not good, 10% bad, and 5% very bad. Option D is 25% very good, 35% good, 30% not good, 5% bad, and 5% very bad. The results are also shown in the following diagram:

Diagram 4. Results of Distractor Effectiveness Analysis
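The percentages reported for each answer option follow directly from the rating counts over the 40 items. The sketch below shows that conversion for options A and D, using the counts given in the text; the function name and the consistency check that the counts add up to 40 are illustrative additions, not part of the original analysis.

```python
N_ITEMS = 40  # number of multiple-choice questions in the instrument


def category_percentages(counts, n_items=N_ITEMS):
    """Convert rating counts per category into percentages of all items."""
    total = sum(counts.values())
    if total != n_items:
        # A mismatch usually signals a transcription error in the counts.
        raise ValueError(f"counts sum to {total}, expected {n_items}")
    return {label: 100.0 * n / n_items for label, n in counts.items()}


# Counts for answer options A and D as reported in the text.
option_a = {"very good": 10, "good": 5, "not good": 11, "bad": 12, "very bad": 2}
option_d = {"very good": 10, "good": 14, "not good": 12, "bad": 2, "very bad": 2}

print(category_percentages(option_a))
print(category_percentages(option_d))
```

The sum check is a cheap way to catch a miscopied count before percentages are reported.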
At this stage, the assessment was revised by correcting any weaknesses and shortcomings found in the large-scale assessment test, keeping the valid and reliable questions so that the assessment could be used in the usage test to measure students' cognitive abilities.
The usage test was carried out to obtain data on students' cognitive abilities, as empirical evidence of whether the assessment developed is able to measure the stages of students' cognitive abilities. The usage test took a sample of 24 students at SD Negeri 1 Banyuwangi, who worked on the questions found valid and reliable in the large-scale assessment test.
The usage test yielded data on students' cognitive abilities in the form of scores. At this stage the assessment product was also packaged, including binding, so that the final product took the form of a hard copy containing 35 integrated thematic-based items to measure students' cognitive abilities.
The responses of teachers and students obtained through questionnaires show that the integrated thematic-based written test assessment met the good and very good criteria, so it was concluded that the assessment was well developed and suitable for use. This is in accordance with Taufiq (2015), who found that the results of teacher and student response questionnaires on cognitive tests were in the valid category.
The small-scale assessment test aimed to determine student and teacher responses to the readability of the integrated thematic-based assessment that had been developed. The sample at this stage was 10 students, chosen to include students with low, medium, and high abilities based on their grades. The responses of students and teachers were collected with a questionnaire on the readability of the assessment. In addition to the student response questionnaire, there was a teacher response questionnaire on the readability of the assessment. The teacher's response to the small-scale assessment test was good. All statements received a good response, except statement number 5, that the integrated thematic-based assessment contains questions referring to the learning indicators, which was rated very good.
The input from students and teachers in the small-scale assessment test was used for the product revision in the next stage, before the product was used in the second phase, the large-scale assessment test.

Discussion
The large-scale assessment test aims to collect data on the quality of the items, which includes validity, reliability, discriminating power, and level of difficulty. The items retained from the large-scale trial are used in the usage test to determine the cognitive abilities of students.
The large-scale assessment trial also seeks teacher responses.
The recapitulation of the teacher response questionnaire shows that the assessment developed as a whole: (1) [...] This is in accordance with the item validity criteria of Arikunto (2012) and Purwanti (2014).
The third item analysis is the level of difficulty of the questions. Questions that are too easy do not stimulate students to solve them, while questions that are too difficult can cause students to give up quickly. A good question therefore has a balanced level of difficulty: it is neither too easy nor too difficult, with a difficulty index in the medium criteria (Arikunto, 2009). In line with this, the research conducted by Taufiq (Ainul, 2015) shows that the cognitive test questions were made with a homogeneous level of difficulty. The analysis of the level of difficulty of the questions in the assessment developed shows that, of a total of 40 questions, 21 met the medium criteria, 10 the easy criteria, and 9 the difficult criteria. This is because the subject matter of Islamic Kingdoms in Indonesia is still in the category of material that is easy to understand, so the questions measure students' knowledge at easy, medium, and difficult levels within the C1 to C3 cognitive domain.
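To make the difficulty criteria concrete, the sketch below computes a difficulty index as the proportion of students who answer an item correctly and sorts items into the three categories. The cut-offs (up to 0.30 difficult, above 0.30 up to 0.70 medium, above 0.70 easy) are a commonly used convention and are an assumption here, since the source does not state the thresholds it applied; the function names and the per-item correct counts are hypothetical.

```python
from collections import Counter


def difficulty_index(n_correct, n_students):
    """Proportion of students who answered the item correctly."""
    return n_correct / n_students


def classify_difficulty(p):
    """Assumed cut-offs: <= 0.30 difficult, <= 0.70 medium, otherwise easy."""
    if p <= 0.30:
        return "difficult"
    if p <= 0.70:
        return "medium"
    return "easy"


# Hypothetical numbers of correct answers for five items, out of 36 students.
correct_counts = [33, 18, 9, 27, 12]
labels = [classify_difficulty(difficulty_index(c, 36)) for c in correct_counts]
print(Counter(labels))

# Distribution reported in the text; the three categories cover all 40 items.
reported = {"medium": 21, "easy": 10, "difficult": 9}
print(sum(reported.values()))  # 40
```

Under these assumed cut-offs, an item answered correctly by half of the students falls in the medium category, which is the range the text describes as balanced.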
The next item analysis is discriminating power. The discriminating power of an item is its ability to distinguish high-ability students from low-ability students. Surya (2016) stated, "The results of the discrimination analysis show that negative and difficult items range from easy to hard". The discriminating power analysis was carried out to find out how well the questions in the assessment developed distinguish the more able students (upper group) from the less able students (lower group).
[...] 279) stated that, "In a good item, the distractors are chosen evenly by students who answer incorrectly. Conversely, in a poor item, the distractors are chosen unevenly". Based on the distractor analysis, the distractors of the analyzed questions were rated very good, good, not good, bad, and very bad, so the questions can be used for the usage test.
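The comparison between the upper and lower groups can be sketched as a discrimination index: the proportion of correct answers in the upper group minus the proportion in the lower group. The 27% group size used as the default, the function name, and the toy data are assumptions for illustration; the source does not report how the groups were formed.

```python
def discrimination_index(responses, item, fraction=0.27):
    """Upper-group minus lower-group proportion correct for one item.

    responses: one row per student, one 0/1 score per item.
    fraction: share of students placed in each group (assumed 27% by default).
    """
    ranked = sorted(responses, key=sum, reverse=True)  # highest totals first
    k = max(1, round(fraction * len(ranked)))
    upper, lower = ranked[:k], ranked[-k:]
    p_upper = sum(row[item] for row in upper) / k
    p_lower = sum(row[item] for row in lower) / k
    return p_upper - p_lower


# Hypothetical toy data (4 students, 3 items), not taken from the study.
toy = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
for j in range(3):
    print(j, discrimination_index(toy, j, fraction=0.5))
```

A value near 1 means the item is answered correctly mostly by the upper group, a value near 0 means it does not separate the groups, and a negative value means the lower group outperformed the upper group on that item.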
The item analysis, which covered validity, reliability, level of difficulty, discriminating power, and distractors, showed that the assessment developed fulfilled content validity, while the validation by assessment, material, and language experts showed that it fulfilled construct validity. It was therefore concluded that the instrument developed was valid because it fulfilled both content validity and construct validity.
The usage test was carried out to obtain data on students' cognitive abilities by having them work on the 35 items in the assessment developed. The results of the usage test were used as empirical evidence from the field of whether the assessment is able to measure the cognitive abilities of students, and as support for the expert judgment. The results of the usage test and the expert validation show that the integrated thematic-based assessment developed is able to measure students' cognitive abilities.
The final product of this development research is an integrated thematic-based assessment to measure students' cognitive abilities, with 35 items adapted to the learning indicators, complete with the test blueprint and answer keys. The product has passed construct validation by three experts (assessment, material, and language), content validation through item analysis (validity, reliability, level of difficulty, discriminating power, and distractors), and a usage test as empirical evidence, so the final assessment can proceed to the mass production stage if it is needed to measure the cognitive abilities of students in other schools.

D. Conclusion
The integrated thematic-based written test assessment developed on the theme History of Indonesian Civilization, subtheme Islamic Kingdoms in Indonesia, was used in grade V of elementary schools in Pringsewu District.
The integrated thematic-based written test assessment on the theme History of Indonesian Civilization has met good criteria in terms of level of difficulty, discriminating power, distractor analysis, validity, and reliability, so it can be used to measure the cognitive abilities of fifth grade students.